DESCRIPTION



USPA1 AND USPA2 ANTIGENS OF MORAXELLA CATARRHALIS



BACKGROUND OF THE INVENTION 



I. Field of the Invention



The present invention relates generally to the fields of microbiology and clinical
bacteriology. More particularly, it concerns sequences of the uspA1 and uspA2 genes, which
encode the proteins UspA1 and UspA2, respectively, both of which contain an epitope reactive
with monoclonal antibody (MAb) 17C7 and provide useful epitopes for immunodiagnosis and
immunoprophylaxis.

II. Description of Related Art 

It was previously thought that Moraxella catarrhalis, formerly known as Branhamella
catarrhalis or Neisseria catarrhalis, was a harmless saprophyte of the upper respiratory tract
(Catlin, 1990; Berk, 1990). However, during the previous decade, it has been determined that
this organism is an important human pathogen. Indeed, it has been established that this
Gram-negative diplococcus is the cause of a number of human infections (Murphy, 1989).
M. catarrhalis is now known to be the third most common cause of both acute and chronic otitis
media (Catlin, 1990; Faden et al., 1990; 1991; Marchant, 1990), the most common disease for
which infants and children receive health care according to the 1989 Consensus Report. This
organism also causes acute maxillary sinusitis and generalized infections of the lower respiratory
tract (Murphy and Loeb, 1989), and is an important cause of bronchopulmonary infections in
patients with underlying chronic lung disease and, less frequently, of systemic infections in
immunocompromised patients (Melendez and Johnson, 1990; Sarubbi et al., 1990; Schonheyder
and Ejlertsen, 1989; Wright and Wallace, 1989).

The 1989 Consensus Report further concluded that prevention of otitis media is an
important health care goal because of its occurrence both in infants and children and in certain
populations of all age groups. In fact, the total financial burden of otitis media has been
estimated to be at least $2.5 billion annually. Vaccines were identified as the most desired
approach to prevent this disease for a number of reasons. For example, it was estimated that if
vaccines could reduce the incidence of otitis media by 30%, then the annual health care savings
would be at least $400 million. However, while some progress has been made in the
development of vaccines for 2 of the 3 common otitis media pathogens, Streptococcus
pneumoniae and Haemophilus influenzae, there is no indication that similar progress has been
made with respect to M. catarrhalis. This is particularly troublesome in that M. catarrhalis
now accounts for approximately 17-20% of all otitis media infections (Murphy, 1989). In
addition, M. catarrhalis is also a significant cause of sinusitis (van Cauwenberge et al., 1993)
and persistent cough (Gottfarb and Brauner, 1994) in children. In the elderly, it infects patients
with predisposing conditions such as chronic obstructive pulmonary disease (COPD) and other
chronic cardiopulmonary conditions (Boyle et al., 1991; Davies and Maesen, 1988; Hager et al.,
1987).

Despite its recognized virulence potential, little is known about the mechanisms
employed by M. catarrhalis in the production of disease or about host factors governing
immunity to this pathogen. An antibody response to M. catarrhalis otitis media has been
documented by means of an ELISA system using whole M. catarrhalis cells as antigen and
acute and convalescent sera or middle ear fluid as the source of antibody (Leinonen et al.,
1981). The development of serum bactericidal antibody during M. catarrhalis infection in
adults was shown to be dependent on the classical complement pathway (Chapman et al.,
1985). More recently, it was reported that young children with M. catarrhalis otitis media
develop an antibody response in the middle ear but fail to develop a systemic antibody response
in a uniform manner (Faden et al., 1992).

Previous attempts have been made to identify and characterize M. catarrhalis antigens
that would serve as potentially important targets of the human immune response to infection
(Murphy, 1989; Goldblatt et al., 1990; Murphy et al., 1990). Generally speaking, the surface of
M. catarrhalis is composed of outer membrane proteins (OMPs), lipooligosaccharide (LOS)
and fimbriae. M. catarrhalis appears to be somewhat distinct from other Gram-negative
bacteria in that attempts to isolate the outer membrane of this organism using detergent
fractionation of cell envelopes have generally proven unsuccessful, because the procedures
did not yield consistent results (Murphy, 1989; Murphy and Loeb, 1989). Moreover,
preparations were found to be contaminated with cytoplasmic membranes, suggesting an
unusual characteristic of the M. catarrhalis cell envelope.

Passive immunization with polyclonal antisera raised against outer membrane vesicles
of the M. catarrhalis strain 035E was also found to protect against pulmonary challenge by the
heterologous M. catarrhalis strain TTA24. In addition, active immunization with M.
catarrhalis outer membrane vesicles resulted in enhanced clearance of this organism from the
lungs after challenge. The positive effect of immunization in pulmonary clearance indicates
that antibodies play a major role in immunoprotection from this pathogen. In addition, the
protection observed against pulmonary challenge with a heterologous M. catarrhalis strain
demonstrates that one or more conserved surface antigens are targets for antibodies which
function to enhance clearance of M. catarrhalis from the lungs.

Outer membrane proteins (OMPs) constitute major antigenic determinants of this
unencapsulated organism (Bartos and Murphy, 1988) and different strains share remarkably
similar OMP profiles (Bartos and Murphy, 1988; Murphy and Bartos, 1989). At least three
different surface-exposed outer membrane antigens have been shown to be well-conserved among
M. catarrhalis strains; these include the 81 kDa CopB OMP (Helminen et al., 1993b), the
heat-modifiable CD OMP (Murphy et al., 1993) and the high-molecular weight UspA antigen
(Helminen et al., 1994). Of these three antigens, both the CopB protein and UspA antigen have
been shown to bind antibodies which exert biological activity against M. catarrhalis in an animal
model (Helminen et al., 1994; Murphy et al., 1993).

The MAb, designated 17C7, was described as binding to UspA, a very high molecular
weight protein that migrated with an apparent molecular weight (in SDS-PAGE) of at least 250
kDa (Helminen et al., 1994; Klingman and Murphy, 1994). MAb 17C7 enhanced pulmonary
clearance of M. catarrhalis from the lungs of mice when used in passive immunization studies
and, in colony blot radioimmunoassay analysis, bound to every isolate of M. catarrhalis
examined. This same MAb also reacted, although less intensely, with another antigen band of
approximately 100 kDa, as described in U.S. Patent No. 5,552,146 (incorporated herein by
reference). A recombinant bacteriophage containing a fragment of M. catarrhalis
chromosomal DNA was also identified; it expressed a protein product that bound MAb 17C7
and migrated at a rate similar to or indistinguishable from that of the native UspA antigen from
M. catarrhalis (Helminen et al., 1994).

With the rising importance of this pathogen in respiratory tract infections, identification
of the surface components of this bacterium involved in virulence expression and immunity is
becoming more important. To date, there are no vaccines available, against any other OMP,
LOS or fimbriae, that induce protective antibodies against M. catarrhalis. Thus, it is clear that
there remains a need to identify and characterize useful antigens which can be employed in
the preparation of immunoprophylactic reagents. Additionally, once such an antigen or
antigens is identified, there is a need for providing methods and compositions which will allow
the preparation of vaccines in quantities that will allow their use on a wide scale basis in
prophylactic protocols.

SUMMARY OF THE INVENTION 

It is, therefore, an object of the present invention to provide new UspA1 and UspA2
proteins and genes coding therefor. It also is an object of the present invention to provide
methods of using these new proteins, for example, in the preparation of agents for the treatment
and inhibition of M. catarrhalis infection. It also is contemplated that, through the use of other
technologies such as antibody treatment and immunoprophylaxis, one can inhibit or even
prevent M. catarrhalis infections.

In satisfying these goals, there are provided epitopic core sequences of UspA1 and
UspA2 which can serve as the basis for the preparation of therapeutic or prophylactic
compositions or vaccines which comprise peptides of 7, 10, 20, 30, 40, 50 or even 60 amino
acids in length that elicit an antigenic reaction and a pharmaceutically acceptable buffer or
diluent. These peptides may be coupled to a carrier, adjuvant, another peptide or other
molecule such that an effective antigenic response to M. catarrhalis is retained or even
enhanced. Alternatively, these peptides may act as carriers themselves when coupled to another
peptide or other molecule that elicits an antigenic response to M. catarrhalis or another
pathogen. For example, UspA2 can serve as a carrier for an oligosaccharide.

In one embodiment, the epitopic core sequences of UspA1 and UspA2 comprise one or
more isolated peptides of 7, 10, 20, 30, 40, 50 or even 60 amino acids in length having the
amino acid sequence AQQQDQH (SEQ ID NO:17).



SUBSTITUTE SHEET (RULE 25) 



WO 98/28333 



PCT/US97/23930 






In another embodiment, there are provided nucleic acids, uspA1 and uspA2, which
encode the UspA1 and the UspA2 antigens, respectively, as well as the amino acid sequences of
the UspA1 and UspA2 antigens of the M. catarrhalis isolates 035E, TTA24, TTA37, and
046E. It is envisioned that nucleic acid segments and fragments of the genes uspA1 and uspA2
and the UspA1 and UspA2 antigens will be of value in the preparation and use of therapeutic or
prophylactic compositions or vaccines for treating, inhibiting or even preventing M. catarrhalis
infections.

In another embodiment, there is provided a method for inducing an immune response in 
a mammal comprising the step of providing to the mammal an antigenic composition that 
comprises an isolated peptide of about 20 to about 60 amino acids that contains the identified 
epitopic core sequence and a pharmaceutically acceptable buffer or diluent. 

In another embodiment, there is provided a method for diagnosing M. catarrhalis
infection which comprises the step of determining the presence, in a sample, of an M.
catarrhalis amino acid sequence corresponding to residues of the epitopic core sequences of
either the UspA1 or UspA2 antigen. This method may comprise PCR™ detection of the
nucleotide sequences or alternatively an immunologic reactivity of an antibody to either a
UspA1 or UspA2 antigen.

In a further embodiment, there is provided a method for treating an individual having an
M. catarrhalis infection which comprises providing to the individual an isolated peptide of
about 20 to about 60 amino acids that comprises at least about 10 consecutive residues of the
amino acid sequence identified as an epitopic core sequence of UspA1 or UspA2.

In a still further embodiment, there is provided a method for preventing or limiting an
M. catarrhalis infection that comprises providing to a subject an antibody that reacts
immunologically with the identified epitopic core region of either UspA1 or UspA2 of M.
catarrhalis.

In another embodiment, there is provided a method for screening a peptide for reactivity
with an antibody that binds immunologically to UspA1, UspA2 or both which comprises the
steps of providing the peptide and contacting the peptide with the antibody and then
determining the binding of the antibody to the peptide. This method may comprise an
immunoassay such as a western blot, an ELISA, an RIA or an immunoaffinity separation.

In a still further embodiment, there is provided a method for screening a UspA1 or
UspA2 peptide for its ability to induce a protective immune response against M. catarrhalis by
providing the peptide, administering it in a suitable form to an experimental animal, challenging
the animal with M. catarrhalis and then assaying for an M. catarrhalis infection in the animal.
It is envisioned that the animal used will be a mouse that is challenged by a pulmonary
exposure to M. catarrhalis and that the assaying comprises assessing the degree of pulmonary
clearance by the mouse.

Other objects, features and advantages of the present invention will become apparent
from the following detailed description. It should be understood, however, that the detailed
description and the specific examples, while indicating preferred embodiments of the invention,
are given by way of illustration only, since various changes and modifications within the spirit
and scope of the invention will become apparent to those skilled in the art from this detailed
description.

BRIEF DESCRIPTION OF THE DRAWINGS 

The following drawings form part of the present specification and are included to further
demonstrate certain aspects of the present invention. The invention may be better understood
by reference to one or more of these drawings in combination with the detailed description of
specific embodiments presented herein.

FIG. 1. Southern blot analysis of PvuII-digested chromosomal DNA from strains of M.
catarrhalis using a probe from the uspA1 gene. Bacterial strain designations are at the top;
kilobase (kb) position markers are on the left.

FIG. 2A. Proteins present in whole cell lysates of the wild-type strain 035E and the
isogenic uspA1 mutant strain. These proteins were resolved by SDS-PAGE and stained with
Coomassie blue. The left lane (WT) contains the wild-type strain and the right lane (MUT) contains
the mutant. The arrows indicate the protein, approximately 120 kDa in size, that is present in the
wild-type and missing in the mutant. Kilodalton position markers are on the left.

FIG. 2B. Western blot analysis of whole cell lysates of the wild-type strain 035E and the
isogenic uspA1 mutant strain. These proteins were resolved by SDS-PAGE and probed with MAb
17C7 in western blot analysis. The left lane (WT) contains the wild-type strain and the right lane
(MUT) contains the mutant. Kilodalton position markers are on the left. It can be seen that
both strains possess the very high molecular weight band reactive with MAb 17C7 whereas only
the wild-type strain also has a band of approximately 120 kDa that binds this MAb.

FIG. 2C. Western blot analysis of whole cell lysate (WCL) and EDTA-extracted outer
membrane vesicles (OMV) from the wild-type strain 035E (WT) and the isogenic uspA1 mutant
(MUT) using MAb 17C7. Samples were heated either at 37°C for 15 minutes (H) or at 100°C for
5 minutes (B) prior to SDS-PAGE. Molecular weight position markers (in kilodaltons) are
indicated on the left. The open arrow indicates the position of the very high molecular weight
form of the MAb 17C7-reactive antigen; the closed arrow indicates the position of the
approximately 120 kDa protein; the open circle indicates the position of the approximately 70-80
kDa protein.

FIG. 3. Southern blot analysis of chromosomal DNA from the wild-type M. catarrhalis
strain 035E and the isogenic uspA1 mutant. Chromosomal DNA was digested with PvuII and
probed with a 0.6 kb BglII-PvuII fragment from the uspA1 gene. The wild-type strain is listed as
035E at the top of this figure and the mutant strain is listed as 035E-uspA1. Kilobase position
markers are present on the left side.

FIG. 4. Western blot reactivity of proteins in M. catarrhalis strain 035E outer membrane
vesicles (labeled 035E OMV) and the MF-4-1 GST fusion protein (labeled GST fusion protein)
with MAb 17C7.

FIG. 5. PCR™ products obtained by the use of the T3 and P10 primers (middle lane - 0.9
kb product) and the T7 and P9 primers (right lane - 1.7 kb product) when used in a PCR™
amplification with chromosomal DNA from the uspA1 mutant. A kb ladder is present in the first
lane; several kb position markers are listed on the left side of this figure.

FIG. 6A-6C. SDS-PAGE and westerns of purified proteins. FIG. 6A. Coomassie blue
stained gel of purified UspA2 (lane 2). FIG. 6B. Coomassie blue stained gel of purified UspA1
prepared without heating of sample (lane 4), heated for 3 min at 100°C (lane 5), heated for 5
min at 100°C (lane 6), and heated for 10 min at 100°C (lane 7). FIG. 6C. Western of the
purified UspA2 (lane 9) and purified UspA1 (lane 10) probed with the 17C7 MAb. Both
proteins were heated 10 min. The molecular size markers in lanes 1, 3, and 8 are as indicated in
kilodaltons.

FIG. 7. Interaction of purified UspA1 and UspA2 with HEp-2 cells as determined by
ELISA. HEp-2 cell monolayers cultured in a 96-well plate were incubated with serially diluted
UspA1 or UspA2. The 035E bacterial strain was used as the positive control. The bacteria were
diluted analogously to the proteins, beginning with a suspension with an A550 of 1.0. The bound
proteins or attached bacteria were detected with a 1:1 mixed antisera to UspA1 and UspA2 as
described in the methods.

FIG. 8. Interaction with fibronectin and vitronectin determined by dot blot. The bound
vitronectin was detected with rabbit polyclonal antibodies; the protein bound to the fibronectin
was detected with pooled sera made against UspA1 and UspA2.

FIG. 9. The levels of antibodies to the proteins UspA1 and UspA2 and to M. catarrhalis 035E
strain in normal human sera. Data are the log10 transformed end-point titers of the IgG (FIGs.
9A-9C) and IgA (FIGs. 9D-9F) antibodies determined by ELISA. The individual titers were
plotted according to age group and the geometric mean titer for each age group linked by a solid
line. Sera for the 2-18 month old children were consecutive samples from a group of ten
children.

FIG. 10. Subclass distribution of IgG antibodies to UspA1 and UspA2 in normal
human sera. FIG. 10A shows titers toward UspA1 and FIG. 10B shows titers to UspA2.

FIG. 11. Relationship of serum IgG titers to UspA1 (FIG. 11A) and UspA2 (FIG. 11B)
with the bactericidal titer against the 035E strain determined by logistic regression (p<0.05).
The solid line indicates the linear relationship between the IgG titer and bactericidal titer.
Broken lines represent the 95% confidence intervals of the linear fit.


FIG. 12. Schematic drawing showing the relative positions of decapeptides 10-24
within the region of UspA1 and UspA2 which binds to MAb 17C7.

FIG. 13. Western dot blot analysis demonstrating reactivity of decapeptides 10-24 with 
MAb 17C7. 

FIG. 14. Partial restriction enzyme map of the uspA1 (FIG. 14A) and uspA2 (FIG. 14B)
genes from M. catarrhalis strain 035E and the mutated versions of these genes. The shaded
boxes indicate the open reading frame of each gene. Relevant restriction sites are indicated.
PCR™ primer sites (P1-P6) are indicated by arrows. The DNA fragments containing the partial
uspA1 and uspA2 open reading frames that were derived from M. catarrhalis strain 035E
chromosomal DNA by PCR™ and cloned into pBluescriptII SK+ are indicated by black bars.
Dotted lines connect corresponding restriction sites on these DNA inserts and the chromosome.
Open bars indicate the location of the kanamycin or chloramphenicol cassettes, respectively.
The DNA probes specific for uspA1 or uspA2 are indicated by the appropriate cross-hatched
bars and were amplified by PCR™ from M. catarrhalis strain 035E chromosomal DNA by the
use of the oligonucleotide primer pairs
P3 (5'-GACGCTCAACAGCACTAATACG-3') (SEQ ID NO:20)/P4
(5'-CCAAGCTGATATCACTACC-3') (SEQ ID NO:21) and
P5 (5'-TCAATGCCTTTGATGGTC-3') (SEQ ID NO:22)/P6
(5'-TGTATGCCGCTACTCGCAGCT-3') (SEQ ID NO:23), respectively.

FIG. 15. Detection of the UspA1 and UspA2 proteins in wild-type and mutant strains
of M. catarrhalis 035E. Proteins present in EDTA-extracted outer membrane vesicles from the
wild-type strain (lane 1), the uspA1 mutant strain 035E.1 (lane 2), the uspA2 mutant strain
035E.2 (lane 3), and the isogenic uspA1 uspA2 double mutant strain 035E.12 (lane 4) were
resolved by SDS-PAGE, and either stained with Coomassie blue (FIG. 15A) or transferred to
nitrocellulose and probed with MAb 17C7 followed by radioiodinated goat anti-mouse
immunoglobulin in western blot analysis (FIG. 15B). In FIG. 15A, the closed arrow indicates the
very high molecular weight form of the UspA antigen, which comprises both UspA1 and UspA2. In
FIG. 15B, the bracket on the left indicates the very high molecular weight forms of the UspA1
and UspA2 proteins that bind MAb 17C7. The open arrow indicates the 120 kDa, putative
monomeric form of UspA1. The closed arrow indicates the 85 kDa, putative monomeric form
of UspA2. Molecular weight position markers (in kilodaltons) are present on the left.

FIG. 16. Comparison of the rate and extent of growth of the wild-type and mutant
strains of M. catarrhalis. The wild-type strain 035E (closed squares), the uspA1 mutant
035E.1 (open squares), the uspA2 mutant 035E.2 (closed circles), and the uspA1 uspA2 double
mutant 035E.12 (open circles) of M. catarrhalis 035E from overnight broth cultures were
diluted to a density of 35 Klett units in BHI broth and subsequently allowed to grow at 37°C with
shaking. Growth was followed by means of turbidity measurements.

FIG. 17. Susceptibility of wild-type and mutant strains of M. catarrhalis to killing by
normal human serum. Cells of the wild-type parent strain 035E (diamonds), uspA1 mutant
035E.1 (triangles), uspA2 mutant 035E.2 (circles), and uspA1 uspA2 double mutant 035E.12
(squares) from logarithmic-phase BHI broth cultures were incubated in the presence of 10%
(v/v) normal human serum (closed symbols) or heat-inactivated normal human serum (open
symbols). Data are presented as the percentage of the original inoculum remaining at each time
point.



DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS 



The present invention relates to the identification of epitopes useful for developing
potential vaccines against M. catarrhalis. Early work was directed at determining the
molecular nature of the UspA antigen and characterizing the epitope which is recognized by
MAb 17C7. Preliminary work indicated that MAb 17C7 recognizes a single antigenic epitope
and it was believed that this epitope was encoded by a single gene. However, isolation of the
protein which contained the epitope yielded unexpected results. MAb 17C7 recognized a single
epitope, but the characteristics of the protein associated with the epitope suggested the existence
of not one but two separate proteins. Further careful analyses led to a surprising discovery. A
single epitope of the UspA antigen is recognized by MAb 17C7, but this epitope is present
in two different proteins, UspA1 and UspA2, which are encoded by two different genes, uspA1
and uspA2, respectively, and have only 43% identity to each other. The present invention
provides the nucleotide sequences of the genes uspA1 and uspA2, their respective protein
products, UspA1 and UspA2, and the shared epitope recognized by MAb 17C7.

In addition, the present invention provides insights into the antigenic structure of the
UspA protein based on the analysis of the sequences of the UspA1 and UspA2 proteins which
comprise the protein. Characterization of the epitopic region of the molecule that is targeted by
MAb 17C7 permits the development of agents that will be useful in protecting against M.
catarrhalis infections, e.g., in the preparation of prophylactic reagents. Particular embodiments
relate to the amino acid and nucleic acid sequences corresponding to the UspA1 and UspA2 proteins,
peptides and antigenic compositions derived therefrom, and methods for the diagnosis and
treatment of M. catarrhalis disease.

As stated previously, M. catarrhalis infections present a serious health challenge,
especially to the young. Thus, there is a clear need to develop compositions and methods that
will aid in the treatment and diagnosis of this disease. The present invention, by virtue of new
information regarding the structure of the UspA antigen of M. catarrhalis, and discovery of the
two new and distinct proteins UspA1 and UspA2, provides such improved compositions and
methods. UspA1 and UspA2 represent important antigenic determinants, as MAb 17C7 has
been shown to protect experimental animals, as measured in a pulmonary clearance model,
when provided in passive immunizations.

In a first embodiment, the present invention provides for the identification of the
proteins UspA1 and UspA2 from M. catarrhalis strain 035E. The UspA1 protein comprises
about 831 amino acid residues and has a predicted mass of about 88,271 daltons (SEQ ID
NO:1). The UspA2 protein comprises about 576 residues and has a predicted mass of about
62,483 daltons (SEQ ID NO:3). UspA2 is not a truncated or processed form of UspA1.

In a second embodiment, the present invention identifies the specific epitope to
which MAb 17C7 binds. A common peptide sequence, designated as the "3Q" peptide, found
between amino acid residues 480-502 and 582-604 of the UspA1 protein (SEQ ID NO:1) and
residues 355-377 of the UspA2 protein (SEQ ID NO:3) of M. catarrhalis strain 035E,
encompasses the region which appears to be recognized by MAb 17C7. (Note that numbering
of the amino acid residues is based upon strain 035E as provided in SEQ ID NO:3.) It is
envisioned that this region plays an important role in the biology of the pathogen and, from this
information, one will deduce amino acid residues that are critical in MAb 17C7 antibody
binding. It also is envisioned that, based upon this information, one will be able to design
epitopic regions that have either a higher or lower affinity for MAb 17C7 or other
antibodies. Further embodiments of the present invention are discussed below.

In another preferred embodiment, the present invention provides DNA segments,
vectors and the like comprising at least one isolated gene, DNA segment or coding region that
encodes a M. catarrhalis UspA1 or UspA2 protein, polypeptide, domain, peptide or any fusion
protein thereof. Herein are provided at least an isolated gene, DNA segment or coding region
that encodes a M. catarrhalis uspA1 gene comprising about 2493 base pairs (bp) (SEQ ID
NO:2) of strain 035E, about 3381 bp (SEQ ID NO:6) of strain 046E, about 3538 bp (SEQ ID
NO:10) of strain TTA24, or about 3292 bp (SEQ ID NO:14) of strain TTA37. Further provided
are at least an isolated gene, DNA segment or coding region that encodes a M. catarrhalis
uspA2 gene comprising about 1728 bp (SEQ ID NO:4) of strain 035E, about 3295 bp (SEQ ID
NO:8) of strain 046E, about 2673 bp (SEQ ID NO:12), or about 4228 bp (SEQ ID NO:16) of
strain TTA37. It is envisioned that the uspA1 and uspA2 genes will be useful in the preparation
of proteins, antibodies, screening assays for potential candidate drugs and the like to treat or
inhibit, or even prevent, M. catarrhalis infections.
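
As a rough consistency check on the strain 035E figures quoted above, a coding region of n codons spans 3n base pairs, so the residue counts given for UspA1 (about 831) and UspA2 (about 576) can be compared with the base-pair counts of about 2493 and about 1728. The following is only a minimal sketch: the function names are hypothetical, and because the text does not say whether the quoted lengths include a stop codon, that choice is left as a parameter.

```python
def coding_length_bp(residues: int, include_stop: bool = False) -> int:
    """Base pairs needed to encode `residues` amino acids (3 nt per codon).

    `include_stop` adds one codon for a stop signal; whether the lengths
    quoted in the text include it is an assumption left to the caller.
    """
    codons = residues + (1 if include_stop else 0)
    return 3 * codons


def mean_residue_mass(predicted_mass_da: float, residues: int) -> float:
    """Average mass per residue in daltons, from a predicted protein mass."""
    return predicted_mass_da / residues


if __name__ == "__main__":
    # Residue counts and predicted masses as quoted for strain 035E.
    for name, n_res, mass in (("UspA1", 831, 88271.0), ("UspA2", 576, 62483.0)):
        print(name, coding_length_bp(n_res), round(mean_residue_mass(mass, n_res), 1))
```

With the stop codon excluded, 831 and 576 residues correspond to 2493 and 1728 bp, which agrees with the lengths given for strain 035E.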

The present invention also provides for the use of the UspA1 or UspA2 proteins or
peptides as immunogenic carriers of other agents which are useful for the treatment, inhibition
or even prevention of other bacterial, viral or parasitic infections. It is envisioned that either the
UspA1 or UspA2 antigen, or portions thereof, will be coupled, bonded, bound, conjugated or
chemically-linked to one or more agents via linkers, polylinkers or derivatized amino acids such
that a bispecific or multivalent composition or vaccine which is useful for the treatment,
inhibition or even prevention of infection by M. catarrhalis and another pathogen(s) is
prepared. It is further envisioned that the methods used in the preparation of these compositions
will be familiar to those of skill in the art and, for example, similar to those used to prepare
conjugates to keyhole limpet hemocyanin (KLH) or bovine serum albumin (BSA).

It is important to note that screening methods for diagnosis and prophylaxis are readily
available, as set forth below. Thus, the ability to (i) test peptides, mutant peptides and
antibodies for their reactivity with each other and (ii) test peptides and antibodies for the ability
to prevent infections in vivo provides powerful tools to develop clinically important reagents.

1.0 UspA Proteins, Peptides and Polypeptides 

The present invention, in one embodiment, encompasses the two new protein sequences,
UspA1 and UspA2, and the peptide sequence AQQQDQH (SEQ ID NO:17) identified as the
target epitope of MAb 17C7. In addition, inspection of the amino acid sequences of the UspA1
and UspA2 proteins from four strains of M. catarrhalis indicated that each protein contained at
least one copy of the peptide YELAQQQDQH (SEQ ID NO:18) which binds MAb 17C7 or, in
one instance, a peptide nearly identical and having the amino acid sequence YDLAQQQDQH
(SEQ ID NO:19).

The peptide (YELAQQQDQH, SEQ ID NO:18) occurs twice in UspA1 from strain
035E at residues 486-495 and 588-597 (SEQ ID NO:1) and once in UspA2 from strain 035E at
residues 358-367 (SEQ ID NO:3). It occurs once in UspA1 from strain TTA24 at residues
497-506 (SEQ ID NO:9) and twice in UspA2 from strain TTA24 at residues 225-234 and 413-422
(SEQ ID NO:11). The peptide YDLAQQQDQH (SEQ ID NO:19) occurs once in UspA1 from
strain 046E at residues 448-457 (SEQ ID NO:5) whereas the peptide YELAQQQDQH (SEQ
ID NO:18) occurs once in this same protein at residues 649-658 (SEQ ID NO:5). The peptide
YELAQQQDQH (SEQ ID NO:18) occurs once in UspA2 from strain 046E at residues 416-425
(SEQ ID NO:7). The peptide YELAQQQDQH (SEQ ID NO:18) occurs twice in UspA1 from
strain TTA37 at residues 478-487 and 630-639 (SEQ ID NO:13) and twice in UspA2 from
strain TTA37 at residues 522-531 and 681-690 (SEQ ID NO:15).
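
Residue ranges of the kind listed above can be recovered from a protein sequence by a simple motif search that reports 1-based, inclusive positions. The sketch below is illustrative only: the function name and the toy sequence are hypothetical (the toy string is not a UspA sequence), and the pattern simply accepts either decapeptide named in the text, YELAQQQDQH or YDLAQQQDQH.

```python
import re

# Matches YELAQQQDQH (SEQ ID NO:18) or YDLAQQQDQH (SEQ ID NO:19).
EPITOPE_PATTERN = re.compile(r"Y[ED]LAQQQDQH")


def find_epitopes(protein: str):
    """Return (first_residue, last_residue, peptide) for each match.

    Positions are 1-based and inclusive, matching the residue numbering
    convention used for the ranges quoted in the text.
    """
    hits = []
    for match in EPITOPE_PATTERN.finditer(protein):
        first = match.start() + 1   # convert 0-based index to 1-based residue
        last = match.end()          # end() is exclusive, so it equals the last residue
        hits.append((first, last, match.group()))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence for illustration; NOT a real UspA1/UspA2 sequence.
    toy = "MKT" + "YELAQQQDQH" + "GSGS" + "YDLAQQQDQH" + "AA"
    for first, last, peptide in find_epitopes(toy):
        print(f"{peptide} at residues {first}-{last}")
```

Applied to a full-length sequence such as SEQ ID NO:1, the same search would be expected to report the ranges stated above, provided the numbering starts at the first residue of the listed sequence.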

Also encompassed in the present invention are hybrid molecules containing portions
from one UspA protein, for example the UspA1 protein, fused with portions of the other UspA
protein, in this example the UspA2 protein, or fused with other proteins which are useful for
identification, such as kanamycin-resistance, or other purposes in the screening of potential
vaccines or further characterization of the UspA1 and UspA2 proteins. For example, one may
fuse residues 1-350 of any UspA1 with residues 351-576 of any UspA2. Alternatively, a fusion
could be generated with sequences from three, four or even five peptide regions represented in a
single UspA antigen. Also encompassed are fragments of the disclosed UspA1 and UspA2
molecules, as well as insertion, deletion or replacement mutants in which non-UspA sequences
are introduced, UspA sequences are removed, or UspA sequences are replaced with non-UspA
sequences, respectively.

UspA1 and UspA2 proteins, according to the present invention, may be advantageously 
cleaved into fragments for use in further structural or functional analysis, or in the generation of 
reagents such as UspA-related polypeptides and UspA-specific antibodies. This can be 
accomplished by treating purified or unpurified UspA1 and/or UspA2 with a peptidase such as 
endoproteinase glu-C (Boehringer, Indianapolis, IN). Treatment with CNBr is another method by 
which UspA1 and/or UspA2 fragments may be produced from their natural respective proteins. 
Recombinant techniques also can be used to produce specific fragments of UspA1 or UspA2. 

More subtle modifications and changes may be made in the structure of the encoded 
UspA1 or UspA2 polypeptides of the present invention and still yield a protein or peptide 
with characteristics of the natural UspA antigen. The following is a discussion based upon 
changing the amino acids of a protein to create an equivalent, or even an improved, 
second-generation molecule. The amino acid changes may be achieved by changing the codons 
of the DNA sequence, according to the following codon table: 

TABLE I 

Amino acid names and abbreviations    Codons 

Alanine          Ala    A             GCA GCC GCG GCU 
Cysteine         Cys    C             UGC UGU 
Aspartic acid    Asp    D             GAC GAU 
Glutamic acid    Glu    E             GAA GAG 
Phenylalanine    Phe    F             UUC UUU 
Glycine          Gly    G             GGA GGC GGG GGU 
Histidine        His    H             CAC CAU 
Isoleucine       Ile    I             AUA AUC AUU 
Lysine           Lys    K             AAA AAG 
Leucine          Leu    L             UUA UUG CUA CUC CUG CUU 
Methionine       Met    M             AUG 
Asparagine       Asn    N             AAC AAU 
Proline          Pro    P             CCA CCC CCG CCU 
Glutamine        Gln    Q             CAA CAG 
Arginine         Arg    R             AGA AGG CGA CGC CGG CGU 
Serine           Ser    S             AGC AGU UCA UCC UCG UCU 
Threonine        Thr    T             ACA ACC ACG ACU 
Valine           Val    V             GUA GUC GUG GUU 
Tryptophan       Trp    W             UGG 
Tyrosine         Tyr    Y             UAC UAU 


It is known that certain amino acids may be substituted for other amino acids in a 
protein structure in order to modify or improve its antigenic or immunogenic activity (see, e.g., 
Kyte & Doolittle, 1982; Hopp, U.S. Patent 4,554,101, incorporated herein by reference). For 
example, through the substitution of alternative amino acids, small conformational changes may 
be conferred upon a polypeptide which result in increased activity or stability. Alternatively, 
amino acid substitutions in certain polypeptides may be utilized to provide residues which may 
then be linked to other molecules to provide peptide-molecule conjugates which retain enough 
antigenicity of the starting peptide to be useful for other purposes. For example, a selected 
UspA1 or UspA2 peptide bound to a solid support might be constructed which would have 
particular advantages in diagnostic embodiments. 

The importance of the hydropathic index of amino acids in conferring interactive 
biological function on a protein has been discussed generally by Kyte & Doolittle (1982), 
wherein it is found that certain amino acids may be substituted for other amino acids having a 
similar hydropathic index or score and still retain a similar biological activity. As displayed in 
Table II below, amino acids are assigned a hydropathic index on the basis of their 
hydrophobicity and charge characteristics. It is believed that the relative hydropathic character 
of the amino acid determines the secondary structure of the resultant protein, which in turn 
defines the interaction of the protein with substrate molecules. Preferred substitutions which 
result in an antigenically equivalent peptide or protein will generally involve amino acids 
having index scores within ±2 units of one another, and more preferably within ±1 unit, and 
even more preferably, within ±0.5 units. 

TABLE II 

Amino Acid           Hydropathic Index 

Isoleucine            4.5 
Valine                4.2 
Leucine               3.8 
Phenylalanine         2.8 
Cysteine/cystine      2.5 
Methionine            1.9 
Alanine               1.8 
Glycine              -0.4 
Threonine            -0.7 
Tryptophan           -0.9 
Serine               -0.8 
Tyrosine             -1.3 
Proline              -1.6 
Histidine            -3.2 
Glutamic Acid        -3.5 
Glutamine            -3.5 
Aspartic Acid        -3.5 
Asparagine           -3.5 
Lysine               -3.9 
Arginine             -4.5 


Thus, for example, isoleucine, which has a hydropathic index of +4.5, will preferably be 
exchanged with an amino acid such as valine (+4.2) or leucine (+3.8). Alternatively, at the 
other end of the scale, lysine (-3.9) will preferably be substituted for arginine (-4.5), and so on. 
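The ±2, ±1 and ±0.5 unit preference tiers can be illustrated with a short lookup over the hydropathic index values of Table II. This is a sketch only: the dictionary simply restates the Table II values, and the function name `substitution_candidates` and the small floating-point tolerance are assumptions made for illustration.

```python
# Hydropathic index values as listed in Table II.
HYDROPATHY = {
    "Ile": 4.5, "Val": 4.2, "Leu": 3.8, "Phe": 2.8, "Cys": 2.5,
    "Met": 1.9, "Ala": 1.8, "Gly": -0.4, "Thr": -0.7, "Trp": -0.9,
    "Ser": -0.8, "Tyr": -1.3, "Pro": -1.6, "His": -3.2, "Glu": -3.5,
    "Gln": -3.5, "Asp": -3.5, "Asn": -3.5, "Lys": -3.9, "Arg": -4.5,
}


def substitution_candidates(residue, tolerance=2.0):
    """List residues whose index lies within +/- tolerance of `residue`,
    closest first."""
    base = HYDROPATHY[residue]
    close = [
        aa for aa, value in HYDROPATHY.items()
        # 1e-9 guards against floating-point rounding at the boundary.
        if aa != residue and abs(value - base) <= tolerance + 1e-9
    ]
    return sorted(close, key=lambda aa: abs(HYDROPATHY[aa] - base))


print(substitution_candidates("Ile", 0.5))  # most preferred tier
print(substitution_candidates("Ile", 1.0))
print(substitution_candidates("Lys", 1.0))
```

For isoleucine, the ±0.5 tier contains only valine, and widening the tolerance to ±1 adds leucine, in line with the example given above.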

Substitution of like amino acids may also be made on the basis of hydrophilicity, 
particularly where the biological functional equivalent protein or peptide thereby created is 
intended for use in immunological embodiments. U.S. Patent 4,554,101, incorporated herein by 
reference, states that the greatest local average hydrophilicity of a protein, as governed by the 
hydrophilicity of its adjacent amino acids, correlates with its immunogenicity and antigenicity, 
i.e., with an important biological property of the protein. 

As detailed in U.S. Patent 4,554,101, each amino acid has also been assigned a 
hydrophilicity value. These values are detailed below in Table III. 

TABLE III 

Amino Acid           Hydrophilic Index 

arginine             +3.0 
lysine               +3.0 
aspartate            +3.0 ± 1 
glutamate            +3.0 ± 1 
serine               +0.3 
asparagine           +0.2 
glutamine            +0.2 
glycine               0 
threonine            -0.4 
alanine              -0.5 
histidine            -0.5 
proline              -0.5 ± 1 
cysteine             -1.0 
methionine           -1.3 
valine               -1.5 
leucine              -1.8 
isoleucine           -1.8 
tyrosine             -2.3 
phenylalanine        -2.5 
tryptophan           -3.4 


It is understood that one amino acid can be substituted for another having a similar 
hydrophilicity value and still obtain a biologically equivalent, and in particular, an 
immunologically equivalent protein. In such changes, the substitution of amino acids whose 
hydrophilicity values are within ±2 is preferred, those which are within ±1 are particularly 
preferred, and those within ±0.5 are even more particularly preferred. 
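The notion of "greatest local average hydrophilicity" can be sketched as a sliding-window average over the Table III values. The code below is an illustration under stated assumptions: it uses the central values from Table III (ignoring the ±1 ranges), a window of 6 residues chosen here for illustration, a hypothetical function name, and a toy sequence that is not a UspA sequence.

```python
# Central hydrophilicity values from Table III, keyed by one-letter code.
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "P": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def most_hydrophilic_window(sequence, window=6):
    """Return (1-based start, average value, peptide) for the window with
    the greatest average hydrophilicity; the first such window wins ties."""
    best = None
    for i in range(len(sequence) - window + 1):
        peptide = sequence[i:i + window]
        average = sum(HYDROPHILICITY[aa] for aa in peptide) / window
        if best is None or average > best[1]:
            best = (i + 1, average, peptide)
    return best


# Toy sequence for illustration only.
print(most_hydrophilic_window("LLVVAKDEKRSWWF"))
```

A window identified this way is only a candidate antigenic region; the text treats hydrophilicity as one correlate of antigenicity, not a guarantee.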

Accordingly, these amino acid substitutions are generally based on the relative similarity 
of R-group substituents, for example, in terms of size, electrophilic character, charge, and the 
like. In general, preferred substitutions which take various of the foregoing characteristics into 
consideration will be known to those of skill in the art and include, for example, the following 
combinations: arginine and lysine; glutamate and aspartate; serine and threonine; glutamine 
and asparagine; and valine, leucine and isoleucine. 

In addition, peptides derived from these polypeptides, including peptides of at least 
about 6 consecutive amino acids from these sequences, are contemplated. Alternatively, such 
peptides may comprise about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 
25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 
50, 51, 52, 53, 54, 55, 56, 57, 58, 59 or 60 consecutive residues. For example, a peptide that 
comprises 6 consecutive amino acid residues may comprise residues 1 to 6, 2 to 7, 3 to 8 and so 
on of the UspA1 or UspA2 protein. Such peptides may be represented by the formula 

x to (x + n) = 5' to 3' the positions of the first and last consecutive residues 

where x is equal to any number from 1 to the full length of a UspA1 or UspA2 protein and n is 
equal to the length of the peptide minus 1. So, for UspA1, x = 1 to 831, for UspA2, x = 1 to 
576. Where the peptide is 10 residues long (n = 10-1), the formula represents every 10-mer 
possible for each antigen. For example, where x is equal to 1 the peptide would comprise 
residues 1 to (1 + [10-1]), or 1 to 10. Where x is equal to 2, the peptide would comprise 
residues 2 to (2 + [10-1]), or 2 to 11, and so on. 
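The formula can be made concrete by enumerating the windows directly. The sketch below uses the lengths of 831 and 576 residues given above; the function name is hypothetical, and it assumes that x stops at (length - n) so that every window lies wholly within the protein, a bound the text does not state explicitly.

```python
def peptide_windows(protein_length, peptide_length):
    """Enumerate (x, x + n) for every peptide of the given length,
    where n = peptide_length - 1 and positions are 1-based."""
    n = peptide_length - 1
    # Assumption: x is capped at protein_length - n so the window fits.
    return [(x, x + n) for x in range(1, protein_length - n + 1)]


uspa1_10mers = peptide_windows(831, 10)
uspa2_10mers = peptide_windows(576, 10)
print(uspa1_10mers[:2], uspa1_10mers[-1], len(uspa1_10mers))
print(uspa2_10mers[-1], len(uspa2_10mers))
```

Under that assumption, a protein of length L yields L - n distinct windows for each peptide length.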

Syntheses of peptides are readily achieved using conventional synthetic techniques such 
as the solid phase method (e.g., through the use of a commercially available peptide synthesizer 
such as an Applied Biosystems Model 430A Peptide Synthesizer). Peptides synthesized in this 
manner may then be aliquoted in predetermined amounts and stored in conventional manners, 
such as in aqueous solutions or, even more preferably, in a powder or lyophilized state pending 
use. 

In general, due to the relative stability of peptides, they may be readily stored in aqueous 
solutions for fairly long periods of time if desired, e.g., up to six months or more, in virtually 
any aqueous solution without appreciable degradation or loss of antigenic activity. However, 
where extended aqueous storage is contemplated it will generally be desirable to include agents 
including buffers such as Tris or phosphate buffers to maintain a pH of 7.0 to 7.5. Moreover, it 
may be desirable to include agents which will inhibit microbial growth, such as sodium azide or 
Merthiolate. For extended storage in an aqueous state it will be desirable to store the solutions 
at 4°C, or more preferably, frozen. Of course, where the peptide(s) are stored in a lyophilized 
or powdered state, they may be stored virtually indefinitely, e.g., in metered aliquots that may 
be rehydrated with a predetermined amount of water (preferably distilled, deionized) or buffer 
prior to use. 

Of particular interest are peptides that represent epitopes that lie within the UspA 
antigen and are encompassed by the UspA1 and UspA2 proteins of the present invention. An 
"epitope" is a region of a molecule that stimulates a response from a T-cell or B-cell, and hence, 
elicits an immune response from these cells. An epitopic core sequence, as used herein, is a 
relatively short stretch of amino acids that is structurally "complementary" to, and therefore will 
bind to, binding sites on antibodies or T-cell receptors. It will be understood that, in the context 
of the present disclosure, the term "complementary" refers to amino acids or peptides that 
exhibit an attractive force towards each other. Thus, certain epitopic core sequences of the 
present invention may be operationally defined in terms of their ability to compete with or 
perhaps displace the binding of the corresponding UspA antigen to the corresponding 
UspA-directed antisera. 

The identification of epitopic core sequences is known to those of skill in the art. For 
example, U.S. Patent 4,554,101 teaches identification and preparation of epitopes from amino 
acid sequences on the basis of hydrophilicity, and by Chou-Fasman analyses. Numerous 
computer programs are available for use in predicting antigenic portions of proteins, examples 
of which include those programs based upon Jameson-Wolf analyses (Jameson and Wolf, 1988; 
Wolf et al., 1988), the program PepPlot® (Brutlag et al., 1990; Weinberger et al., 1985), and 
other new programs for protein tertiary structure prediction (Fetrow & Bryant, 1993) that can be 
used in conjunction with computerized peptide sequence analysis programs. 

In general, the size of the polypeptide antigen is not believed to be particularly crucial, 
so long as it is at least large enough to carry the identified core sequence or sequences. The 
smallest useful core sequence anticipated by the present disclosure would be on the order of 
about 6 amino acids in length. Thus, this size will generally correspond to the smallest peptide 
antigens prepared in accordance with the invention. However, the size of the antigen may be 
larger where desired, so long as it contains a basic epitopic core sequence. 

2.0 UspA1 and UspA2 Nucleic Acids 

In addition to polypeptides, the present invention also encompasses nucleic acids 
encoding the UspA1 (SEQ ID NO:2, SEQ ID NO:6, SEQ ID NO:10 and SEQ ID NO:14) and 
UspA2 (SEQ ID NO:4, SEQ ID NO:8, SEQ ID NO:12 and SEQ ID NO:16) proteins from the 
exemplary M. catarrhalis strains 035E, 046E, TTA24 and TTA37, respectively. Because of 
the degeneracy of the genetic code, many other nucleic acids also may encode a given UspA1 or 
UspA2 protein. For example, four different three-base codons encode each of the amino acids 
alanine, glycine, proline, threonine and valine, while six different codons encode each of 
arginine, leucine and serine. Only methionine and tryptophan are encoded by a single codon. 
Table I provides a list of amino acids and their corresponding codons for use in such 
embodiments. In order to generate any nucleic acid encoding UspA1 or UspA2, one need only 
refer to the codon table provided herein. Substitution of the natural codon with any codon 
encoding the same amino acid will result in a distinct nucleic acid that encodes UspA1 or UspA2. 
As a practical matter, this can be accomplished by site-directed mutagenesis of an existing uspA1 
or uspA2 gene or de novo chemical synthesis of one or more nucleic acids. 
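The extent of this degeneracy can be illustrated by counting the distinct coding sequences for a short peptide. The sketch below is illustrative only: it restates the standard genetic code from the codon table as a dictionary, and the function names are hypothetical. The example peptide is the YELAQQQDQH epitope (SEQ ID NO:18).

```python
from math import prod

# Standard genetic code (RNA codons) keyed by one-letter amino acid code,
# as listed in the codon table.
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCU"],
    "C": ["UGC", "UGU"],
    "D": ["GAC", "GAU"],
    "E": ["GAA", "GAG"],
    "F": ["UUC", "UUU"],
    "G": ["GGA", "GGC", "GGG", "GGU"],
    "H": ["CAC", "CAU"],
    "I": ["AUA", "AUC", "AUU"],
    "K": ["AAA", "AAG"],
    "L": ["UUA", "UUG", "CUA", "CUC", "CUG", "CUU"],
    "M": ["AUG"],
    "N": ["AAC", "AAU"],
    "P": ["CCA", "CCC", "CCG", "CCU"],
    "Q": ["CAA", "CAG"],
    "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGU"],
    "S": ["AGC", "AGU", "UCA", "UCC", "UCG", "UCU"],
    "T": ["ACA", "ACC", "ACG", "ACU"],
    "V": ["GUA", "GUC", "GUG", "GUU"],
    "W": ["UGG"],
    "Y": ["UAC", "UAU"],
}


def count_encodings(peptide):
    """Number of distinct nucleic acid sequences encoding the peptide."""
    return prod(len(CODONS[aa]) for aa in peptide)


def one_encoding(peptide):
    """One possible coding sequence, taking the first listed codon each time."""
    return "".join(CODONS[aa][0] for aa in peptide)


print(count_encodings("YELAQQQDQH"))  # 6144
print(one_encoding("YELAQQQDQH"))
```

Even this 10-residue epitope has thousands of synonymous coding sequences, so the number for a full-length UspA1 or UspA2 protein is far larger.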

These observations regarding codon selection, site-directed mutagenesis and chemical 
synthesis apply with equal force to the discussion of substitutional mutant UspA1 or UspA2 
peptides and polypeptides, as set forth above. More specifically, substitutional mutants generated 
by site-directed changes in the nucleic acid sequence that are designed to alter one or more codons 
of a given polypeptide or epitope may provide a more convenient way of generating large 
numbers of mutants in a rapid fashion. The nucleic acids of the present invention provide for a 
simple way to generate fragments (e.g., truncations) of UspA1 or UspA2, UspA1-UspA2 fusion 
molecules (discussed above) and UspA1 or UspA2 fusions with other molecules. For example, 
the use of restriction enzymes and nucleases on the uspA1 or uspA2 gene permits one to 
manipulate the structure of these genes, and the resulting gene products. 

The nucleic acid sequence information provided by the present disclosure also allows 
for the preparation of relatively short DNA (or RNA) sequences that have the ability to 
specifically hybridize to gene sequences of the selected uspA1 or uspA2 gene. In these aspects, 
nucleic acid probes of an appropriate length are prepared based on a consideration of the coding 
sequence of the uspA1 or uspA2 gene, or flanking regions near the uspA1 or uspA2 gene, such as 
regions downstream and upstream in the M. catarrhalis chromosome. The ability of such 
nucleic acid probes to specifically hybridize to either uspA1 or uspA2 gene sequences lends 
them particular utility in a variety of embodiments. For example, the probes can be used in a 
variety of diagnostic assays for detecting the presence of pathogenic organisms in a given 
sample. In addition, these oligonucleotides can be inserted, in frame, into expression constructs 
for the purpose of screening the corresponding peptides for reactivity with existing antibodies 
or for the ability to generate diagnostic or therapeutic reagents. 

To provide certain of the advantages in accordance with the invention, the preferred 
nucleic acid sequence employed for hybridization studies or assays includes sequences that are 
complementary to at least a 10 to 20, or so, nucleotide stretch of the sequence, although 
sequences of 30 to 60 or so nucleotides are also envisioned to be useful. A size of at least 9 
nucleotides in length helps to ensure that the fragment will be of sufficient length to form a 
duplex molecule that is both stable and selective. Molecules having complementary 
sequences over stretches greater than 10 bases in length are generally preferred, though, in order 
to increase stability and selectivity of the hybrid, and thereby improve the quality and degree of 
the specific hybrid molecules obtained. Thus, one will generally prefer to design nucleic acid 
molecules having either uspA1 or uspA2 gene-complementary stretches of 15 to 20 nucleotides, 
or even longer, such as 30 to 60, where desired. Such fragments may be readily prepared by, 
for example, directly synthesizing the fragment by chemical means, by application of nucleic 
acid reproduction technology, such as the PCR™ technology of U.S. Patent 4,603,102, or by 
introducing selected sequences into recombinant vectors for recombinant production. 

The probes that would be useful may be derived from any portion of the sequences of SEQ 
ID NO:2 or SEQ ID NO:4 or SEQ ID NO:6 or SEQ ID NO:8 or SEQ ID NO:10 or SEQ ID 
NO:12 or SEQ ID NO:14 or SEQ ID NO:16. Therefore, probes are specifically contemplated that 
comprise nucleotides 1 to 9, or 2 to 10, or 3 to 11 and so forth up to a probe comprising the last 9 
nucleotides of the nucleotide sequence of SEQ ID NO:2 or SEQ ID NO:4 or SEQ ID NO:6 or 
SEQ ID NO:8 or SEQ ID NO:10 or SEQ ID NO:12 or SEQ ID NO:14 or SEQ ID NO:16. Thus, 
each probe would comprise at least about 9 linear nucleotides of the nucleotide sequence of SEQ 
ID NO:2 or SEQ ID NO:4 or SEQ ID NO:6 or SEQ ID NO:8 or SEQ ID NO:10 or SEQ ID 
NO:12 or SEQ ID NO:14 or SEQ ID NO:16, designated by the formula "n to n + 8," where n is 
an integer from 1 to the number of nucleotides in the sequence. Longer probes that hybridize to 
the uspA1 or uspA2 gene under low, medium, medium-high and high stringency conditions are 
also contemplated, including those that comprise the entire nucleotide sequence of SEQ ID NO:2 
or SEQ ID NO:4 or SEQ ID NO:6 or SEQ ID NO:8 or SEQ ID NO:10 or SEQ ID NO:12 or SEQ 
ID NO:14 or SEQ ID NO:16. This hypothetical may be repeated for probes having lengths of 
about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 and 
greater bases. 
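The "n to n + 8" scheme can be sketched as a tiling of 9-nucleotide windows along a sequence, each paired with its reverse complement (the strand that would hybridize to it). This is an illustration only: the function names are hypothetical, the toy sequence is not taken from any SEQ ID NO, and n is assumed to stop where the last full 9-mer begins.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def tiled_probes(sequence, length=9):
    """Yield (n, n + length - 1, probe) for every window, 1-based."""
    last_start = len(sequence) - length + 1  # assumed upper bound on n
    for n in range(1, last_start + 1):
        yield n, n + length - 1, sequence[n - 1:n - 1 + length]


# Toy 13-nucleotide sequence for illustration only.
toy = "ATGGCATTACCGA"
for first, last, probe in tiled_probes(toy):
    print(first, last, probe, reverse_complement(probe))
```

Passing a larger `length` argument repeats the same tiling for the longer probe lengths recited above.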

In that the UspA antigenic epitopes of the present invention are believed to be indicative 
of pathogenic Moraxella species as exemplified by strains 035E, 046E, TTA24 and TTA37, 
the probes of the present invention will find particular utility as the basis for diagnostic 
hybridization assays for detecting UspA1 or UspA2 DNA in clinical samples. Exemplary 
clinical samples that can be used in the diagnosis of infections are thus any samples which 
could possibly include Moraxella nucleic acid, including middle ear fluid, sputum, mucus, 
bronchoalveolar fluid, amniotic fluid or the like. A variety of hybridization techniques and 
systems are known which can be used in connection with the hybridization aspects of the 
invention, including diagnostic assays such as those described in Falkow et al, U.S. Patent 
4,358,535. Depending on the application envisioned, one will desire to employ varying 
conditions of hybridization to achieve varying degrees of selectivity of the probe toward the 
target sequence. For applications requiring a high degree of selectivity, one will typically desire 
to employ relatively stringent conditions to form the hybrids, for example, one will select 
relatively low salt and/or high temperature conditions, such as provided by 0.02M-0.15M NaCl 
at temperatures of 50°C to 70°C. These conditions are particularly selective, and tolerate little, 
if any, mismatch between the probe and the template or target strand. 

Of course, for some applications, for example, where one desires to prepare mutants 
employing a mutant primer strand hybridized to an underlying template, less stringent 
hybridization conditions are called for in order to allow formation of the heteroduplex. In these 
circumstances, one would desire to employ conditions such as 0.15M-0.9M salt, at temperatures 
ranging from 20°C to 55°C. In any case, it is generally appreciated that conditions can be 
rendered more stringent by the addition of increasing amounts of formamide, which serves to 
destabilize the hybrid duplex in the same manner as increased temperature. Thus, hybridization 
conditions can be readily manipulated, and the method of choice will generally depend on the 
desired results. 
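The qualitative relationship between salt, formamide and stringency described above can be illustrated with a rough melting-temperature estimate. The formula below is one widely quoted empirical approximation for DNA duplexes and is not taken from this disclosure; its coefficients, including the formamide coefficient, vary between references and are treated here as assumptions. The function name and example values are hypothetical.

```python
import math


def estimate_tm(gc_percent, length, na_molar,
                formamide_percent=0.0, formamide_coeff=0.65):
    """Rough duplex melting temperature in degrees C.

    Assumed approximation (not from the source text):
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 675/N - c*(%formamide)
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 675.0 / length
            - formamide_coeff * formamide_percent)


# Hypothetical 20-nucleotide probe with 50% G+C content.
for salt in (0.02, 0.15, 0.9):
    print(salt, round(estimate_tm(50, 20, salt), 1))
print(round(estimate_tm(50, 20, 0.15, formamide_percent=20), 1))
```

Under this approximation, lowering the salt concentration or adding formamide lowers the estimated melting temperature, so hybridization at a fixed temperature becomes more stringent, consistent with the description above.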



SUBSTITUTE SHEET (RULE 26) 



WO 98/28333 




PCT/US97/23930 



In certain embodiments, one may desire to employ nucleic acid probes to isolate variants 
from clone banks containing mutated clones. In particular embodiments, mutant clone colonies 
growing on solid media which contain variants of the UspA1 and/or UspA2 sequence could be 
identified on duplicate filters using hybridization conditions and methods, such as those used in 
colony blot assays, to obtain hybridization only between probes containing sequence variants 
and nucleic acid sequence variants contained in specific colonies. In this manner, small 
hybridization probes containing short variant sequences of either the uspA1 or uspA2 gene may 
be utilized to identify those clones growing on solid media which contain sequence variants of 
the entire uspA1 or uspA2 gene. These clones can then be grown to obtain desired quantities of 
the variant UspA1 or UspA2 nucleic acid sequences or the corresponding UspA antigen. 

In clinical diagnostic embodiments, nucleic acid sequences of the present invention are 
used in combination with an appropriate means, such as a label, for determining hybridization. 
A wide variety of appropriate indicator means are known in the art, including radioactive, 
enzymatic or other ligands, such as avidin/biotin, which are capable of giving a detectable 
signal. In preferred diagnostic embodiments, one will likely desire to employ an enzyme tag 
such as urease, alkaline phosphatase or peroxidase, instead of radioactive or other 
environmentally undesirable reagents. In the case of enzyme tags, colorimetric indicator 
substrates are known which can be employed to provide a means visible to the human eye or 
spectrophotometrically, to identify specific hybridization with pathogen nucleic acid-containing 
samples. 

In general, it is envisioned that the hybridization probes described herein will be useful 
both as reagents in solution hybridizations as well as in embodiments employing a solid phase. 
In embodiments involving a solid phase, the test DNA (or RNA) from suspected clinical 
samples, such as exudates, body fluids (e.g., amniotic fluid, middle ear effusion, 
bronchoalveolar lavage fluid) or even tissues, is adsorbed or otherwise affixed to a selected 
matrix or surface. This fixed, single-stranded nucleic acid is then subjected to specific 
hybridization with selected probes under desired conditions. The selected conditions will 
depend on the particular circumstances based on the particular criteria required (depending, for 
example, on the G+C contents, type of target nucleic acid, source of nucleic acid, size of 
hybridization probe, etc.). Following washing of the hybridized surface so as to remove 
nonspecifically bound probe molecules, specific hybridization is detected, or even quantified, 
by means of the label. 

The nucleic acid sequences which encode the UspA1 and/or UspA2 epitopes, or their 
variants, may be useful in conjunction with PCR™ methodology to detect M. catarrhalis. In 
general, by applying the PCR™ technology as set out, e.g., in U.S. Patent 4,603,102, one may 
utilize various portions of either the uspA1 or uspA2 sequence as oligonucleotide probes for the 
PCR™ amplification of a defined portion of a uspA1 or uspA2 nucleic acid in a sample. The 
amplified portion of the uspA1 or uspA2 sequence may then be detected by hybridization with a 
hybridization probe containing a complementary sequence. In this manner, extremely small 
concentrations of M. catarrhalis nucleic acid may be detected in a sample utilizing uspA1 or 
uspA2 sequences. 

3.0 Vectors, Host Cells and Cultures for Producing UspA1 and/or UspA2 Antigens 

In order to express a UspA1 and/or UspA2 polypeptide, it is necessary to provide a 
uspA1 and/or uspA2 gene in an expression cassette. The expression cassette contains a UspA1 
and/or UspA2-encoding nucleic acid under transcriptional control of a promoter. A "promoter" 
refers to a DNA sequence recognized by the synthetic machinery of the cell, or introduced 
synthetic machinery, required to initiate the specific transcription of a gene. The phrase "under 
transcriptional control" means that the promoter is in the correct location and orientation in 
relation to the nucleic acid to control RNA polymerase initiation and expression of the gene. 
Those promoters most commonly used in prokaryotic recombinant DNA construction include 
the β-lactamase (penicillinase) and lactose promoter systems (Chang et al., 1978; Itakura et al., 
1977; Goeddel et al., 1979) and a tryptophan (trp) promoter system (Goeddel et al., 1980; EPO 
Appl. Publ. No. 0036776). While these are the most commonly used, other microbial 
promoters have been discovered and utilized, and details concerning their nucleotide sequences 
have been published, enabling a skilled worker to ligate them functionally with plasmid vectors 
(EPO Appl. Publ. No. 0036776). Additional examples of useful promoters are provided in 
Table IV below. 

TABLE IV 

Promoters                          References 

Immunoglobulin Heavy Chain         Banerji et al., 1983; Gilles et al., 1983; Grosschedl and 
                                   Baltimore, 1985; Atchinson and Perry, 1986, 1987; Imler 
                                   et al., 1987; Weinberger et al., 1988; Kiledjian et al., 
                                   1988; Porton et al., 1990 
Immunoglobulin Light Chain         Queen and Baltimore, 1983; Picard and Schaffner, 1984 
T-Cell Receptor                    Luria et al., 1987; Winoto and Baltimore, 1989; Redondo 
                                   et al., 1990 
HLA DQ α and DQ β                  Sullivan and Peterlin, 1987 
β-Interferon                       Goodbourn et al., 1986; Fujita et al., 1987; Goodbourn 
                                   and Maniatis, 1985 
Interleukin-2                      Greene et al., 1989 
Interleukin-2 Receptor             Greene et al., 1989; Lin et al., 1990 
MHC Class II 5                     Koch et al., 1989 
MHC Class II HLA-DRα               Sherman et al., 1989 
β-Actin                            Kawamoto et al., 1988; Ng et al., 1989 
Muscle Creatine Kinase             Jaynes et al., 1988; Horlick and Benfield, 1989; Johnson 
                                   et al., 1989a 
Prealbumin (Transthyretin)         Costa et al., 1988 
Elastase I                         Ornitz et al., 1987 
Metallothionein                    Karin et al., 1987; Culotta and Hamer, 1989 
Collagenase                        Pinkert et al., 1987; Angel et al., 1987 
Albumin Gene                       Pinkert et al., 1987; Tronche et al., 1989, 1990 
α-Fetoprotein                      Godbout et al., 1988; Campere and Tilghman, 1989 
γ-Globin                           Bodine and Ley, 1987; Perez-Stable and Constantini, 1990 
β-Globin                           Trudel and Constantini, 1987 
c-fos                              Cohen et al., 1987 
c-HA-ras                           Triesman, 1986; Deschamps et al., 1985 
Insulin                            Edlund et al., 1985 
Neural Cell Adhesion Molecule      Hirsch et al., 1990 
(NCAM) 
α1-Antitrypsin                     Latimer et al., 1990 
H2B (TH2B) Histone                 Hwang et al., 1990 
Mouse or Type I Collagen           Ripe et al., 1989 
Glucose-Regulated Proteins         Chang et al., 1989 
(GRP94 and GRP78) 
Rat Growth Hormone                 Larsen et al., 1986 
Human Serum Amyloid A (SAA)        Edbrooke et al., 1989 
Troponin I (TN I)                  Yutzey et al., 1989 
Platelet-Derived Growth Factor     Pech et al., 1989 
Duchenne Muscular Dystrophy        Klamut et al., 1990 
SV40                               Banerji et al., 1981; Moreau et al., 1981; Sleigh and 
                                   Lockett, 1985; Firak and Subramanian, 1986; Herr and 
                                   Clarke, 1986; Imbra and Karin, 1986; Kadesch and Berg, 
                                   1986; Wang and Calame, 1986; Ondek et al., 1987; Kuhl 
                                   et al., 1987; Schaffner et al., 1988 
Polyoma                            Swartzendruber and Lehman, 1975; Vasseur et al., 1980; 
                                   Katinka et al., 1980, 1981; Tyndell et al., 1981; Dandolo 
                                   et al., 1983; deVilliers et al., 1984; Hen et al., 1986; 
                                   Satake et al., 1988; Campbell and Villarreal, 1988 
Retroviruses                       Kriegler and Botchan, 1982, 1983; Levinson et al., 1982; 
                                   Kriegler et al., 1983, 1984a,b, 1988; Bosze et al., 1986; 
                                   Miksicek et al., 1986; Celander and Haseltine, 1987; 
                                   Thiesen et al., 1988; Celander et al., 1988; Choi et al., 
                                   1988; Reisman and Rotter, 1989 
Papilloma Virus                    Campo, 1983; Lusky et al., 1983; Spandidos and Wilkie, 
                                   1983; Spalholz et al., 1985; Lusky and Botchan, 1986; 
                                   Cripe et al., 1987; Gloss et al., 1987; Hirochika et al., 
                                   1987; Stephens and Hentschel, 1987; Glue et al., 1988 
Hepatitis B Virus                  Bulla and Siddiqui, 1986; Jameel and Siddiqui, 1986; 
                                   Shaul and Ben-Levy, 1987; Spandau and Lee, 1988; 
                                   Vannice and Levinson, 1988 
Human Immunodeficiency Virus       Muesing et al., 1987; Hauber and Culian, 1988; 
                                   Jakobovits et al., 1988; Feng and Holland, 1988; Takebe 
                                   et al., 1988; Rowen et al., 1988; Berkhout et al., 1989; 
                                   Laspia et al., 1989; Sharp and Marciniak, 1989; Braddock 
                                   et al., 1989 
Cytomegalovirus                    Weber et al., 1984; Boshart et al., 1985; Foecking and 
                                   Hofstetter, 1986 
Gibbon Ape Leukemia Virus          Holbrook et al., 1987; Quinn et al., 1989 

The appropriate expression cassette can be inserted into a commercially available 
expression vector by standard subcloning techniques. For example, the E. coli vectors pUC or 
pBluescript™ may be used according to the present invention to produce recombinant UspA1 
and/or UspA2 polypeptide in vitro. The manipulation of these vectors is well known in the art. In 
general, plasmid vectors containing replicon and control sequences which are derived from 
species compatible with the host cell are used in connection with these hosts. The vector 
ordinarily carries a replication site, as well as marking sequences which are capable of 
providing phenotypic selection in transformed cells. For example, E. coli is typically 
transformed using pBR322, a plasmid derived from an E. coli species (Bolivar et al., 1977). 
pBR322 contains genes for ampicillin and tetracycline resistance and thus provides easy means 
for identifying transformed cells. The pBR plasmid, or other microbial plasmid or phage, must 
also contain, or be modified to contain, promoters which can be used by the microbial organism 
for expression of its own proteins. 

In addition, phage vectors containing replicon and control sequences that are compatible 
with the host microorganism can be used as a transforming vector in connection with these 
hosts. For example, the phage lambda GEM™-11 may be utilized in making a recombinant 
phage vector which can be used to transform host cells, such as E. coli LE392. 

In one embodiment, the UspA antigen is expressed as a fusion protein by using the 
pGEX4T-2 protein fusion system (Pharmacia LKB, Piscataway,NJ), allowing characterization of 
the UspA antigen as comprising both the UspAl and UspA2 proteins. Additional examples of 
fusion protein expression systems are the glutathione S-transferase system (Pharmacia, 
Piscataway, NJ), the maltose binding protein system (NEB, Beverley, MA), the FLAG system 
(IBI, New Haven, CT), and the 6xHis system (Qiagen, Chatsworth, CA). Some of these fusion 
systems produce recombinant protein bearing only a small number of additional amino acids, 
which are unlikely to affect the functional capacity of the recombinant protein. For example, both 
the FLAG system and the 6xHis system add only short sequences, both of which are known to be 
poorly antigenic and which do not adversely affect folding of the protein to its native 
conformation. Other fusion systems produce proteins where it is desirable to excise the fusion
partner from the desired protein. In another embodiment, the fusion partner is linked to the
recombinant protein by a peptide sequence containing a specific recognition sequence for a
protease. Examples of suitable sequences are those recognized by the Tobacco Etch Virus
protease (Life Technologies, Gaithersburg, MD) or Factor Xa (New England Biolabs, Beverly,
MA).
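
As a rough illustration of how such a protease-cleavable linker might be located in a fusion construct, the following Python sketch scans a protein sequence for two commonly cited recognition motifs: ENLYFQ followed by G or S for the Tobacco Etch Virus protease, and IEGR for Factor Xa. The motifs are supplied here as assumptions and are not recited in this specification; the function name find_cleavage_sites and the example sequence are hypothetical.

```python
import re

# Commonly cited recognition motifs (assumptions, not taken from this specification).
# The integer is the number of motif residues that precede the scissile bond.
PROTEASE_SITES = {
    "TEV": (re.compile(r"ENLYFQ[GS]"), 6),   # cut falls between Q and G/S
    "FactorXa": (re.compile(r"IEGR"), 4),    # cut falls after R
}

def find_cleavage_sites(sequence):
    """Return {protease: [cut positions]}.

    A cut position is the number of residues left on the N-terminal side
    of the cut, so sequence[:pos] is the released N-terminal fragment.
    """
    sequence = sequence.upper()
    hits = {}
    for name, (pattern, offset) in PROTEASE_SITES.items():
        hits[name] = [m.start() + offset for m in pattern.finditer(sequence)]
    return hits

# Hypothetical fusion: partner fragment + TEV linker + target fragment
fusion = "MSPILGYWK" + "ENLYFQG" + "MKTVTGA"
sites = find_cleavage_sites(fusion)
for protease, positions in sites.items():
    for pos in positions:
        print(protease, "N-terminal part:", fusion[:pos], "C-terminal part:", fusion[pos:])
```

In practice the choice of protease also depends on which residues may remain on the target protein after cleavage, which this sketch does not model.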

E. coli is a preferred prokaryotic host. For example, E. coli strain RR1 is particularly
useful. Other microbial strains which may be used include E. coli strains such as E. coli LE392,
E. coli B, and E. coli χ1776 (ATCC No. 31537). The aforementioned strains, as well as E. coli
W3110 (F-, lambda-, prototrophic, ATCC No. 273325), bacilli such as Bacillus subtilis, or
other enterobacteriaceae such as Salmonella typhimurium or Serratia marcescens, and various
Pseudomonas species may be used. These examples are, of course, intended to be illustrative
rather than limiting. Recombinant bacterial cells, for example E. coli, are grown in any of a
number of suitable media, for example LB, and expression of the recombinant polypeptide is
induced by adding IPTG to the media or switching incubation to a higher temperature. After
culturing the bacteria for a further period of between 2 and 24 hours, the cells are collected by 
centrifugation and washed to remove residual media. The bacterial cells are then lysed, for 
example, by disruption in a cell homogenizer and centrifuged to separate the dense inclusion 
bodies and cell membranes from the soluble cell components. This centrifugation can be 
performed under conditions whereby the dense inclusion bodies are selectively enriched by 
incorporation of sugars such as sucrose into the buffer and centrifugation at a selective speed. 

If the recombinant protein is expressed in the inclusion bodies, as is the case in many 
instances, these can be washed in any of several solutions to remove some of the contaminating 
host proteins, then solubilized in solutions containing high concentrations of urea (e.g., 8 M) or
chaotropic agents such as guanidine hydrochloride in the presence of reducing agents such as
β-mercaptoethanol or DTT (dithiothreitol).

Under some circumstances, it may be advantageous to incubate the polypeptide for several 
hours under conditions suitable for the protein to undergo a refolding process into a conformation 
which more closely resembles that of the native protein. Such conditions generally include low
protein concentrations less than 500 µg/ml, low levels of reducing agent, concentrations of urea
less than 2 M and often the presence of reagents such as a mixture of reduced and oxidized
glutathione which facilitate the interchange of disulfide bonds within the protein molecule.
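
The concentration limits above translate into a simple dilution calculation. The sketch below assumes that the solubilized protein is diluted into a buffer containing no urea and no protein; the function name min_dilution_factor and the starting values are hypothetical. Because the text calls for values below the limits, the actual dilution should exceed the factor returned.

```python
def min_dilution_factor(protein_ug_per_ml, urea_molar,
                        max_protein_ug_per_ml=500.0, max_urea_molar=2.0):
    """Fold-dilution at which both protein and urea just reach their limits.

    Assumes dilution into buffer that contains neither urea nor protein.
    """
    if protein_ug_per_ml < 0 or urea_molar < 0:
        raise ValueError("concentrations must be non-negative")
    return max(protein_ug_per_ml / max_protein_ug_per_ml,
               urea_molar / max_urea_molar,
               1.0)

# Hypothetical starting material: 3000 ug/ml protein solubilized in 8 M urea
factor = min_dilution_factor(3000.0, 8.0)
print("dilute more than", factor, "fold")
```

Here the protein concentration, not the urea, sets the required dilution; with a more dilute starting solution the urea limit would dominate.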

The refolding process can be monitored, for example, by SDS-PAGE or with antibodies 
which are specific for the native molecule (which can be obtained from animals vaccinated with 
the native molecule isolated from bacteria). Following refolding, the protein can then be purified
further and separated from the refolding mixture by chromatography on any of several supports 
including ion exchange resins, gel permeation resins or on a variety of affinity columns. 

There are a variety of other eukaryotic vectors that provide a suitable vehicle in which
recombinant UspA proteins can be produced. In various embodiments of the invention, the
expression construct may comprise a virus or engineered construct derived from a viral genome.
The ability of certain viruses to enter cells via receptor-mediated endocytosis and to integrate into
the host cell genome and express viral genes stably and efficiently has made them attractive
candidates for the transfer of foreign genes into mammalian cells (Ridgeway, 1988; Nicolas and
Rubenstein, 1988; Baichwal and Sugden, 1986; Temin, 1986). The first viruses used as vectors
were DNA viruses including the papovaviruses (simian virus 40 (SV40), bovine papilloma virus,
and polyoma) (Ridgeway, 1988; Baichwal and Sugden, 1986) and adenoviruses (Ridgeway, 1988;
Baichwal and Sugden, 1986) and adeno-associated viruses. Retroviruses also are attractive gene
transfer vehicles (Nicolas and Rubenstein, 1988; Temin, 1986) as are vaccinia virus (Ridgeway,
1988), adeno-associated virus (Ridgeway, 1988) and herpes simplex virus (HSV) (Glorioso et al.,
1995). Such vectors may be used to (i) transform cell lines in vitro for the purpose of expressing
proteins of interest or (ii) to transform cells in vitro or in vivo to provide therapeutic polypeptides
in a gene therapy scenario.

With respect to eukaryotic vectors, the term promoter will be used here to refer to a group 
of transcriptional control modules that are clustered around the initiation site for RNA polymerase 
II. Much of the thinking about how promoters are organized derives from analyses of several viral
promoters, including those for the HSV thymidine kinase (tk) and SV40 early transcription units. 
These studies, augmented by more recent work, have shown that promoters are composed of 
discrete functional modules, each consisting of approximately 7-20 bp of DNA, and containing 
one or more recognition sites for transcriptional activator or repressor proteins.

At least one module in each promoter functions to position the start site for RNA
synthesis. The best known example of this is the TATA box, but in some promoters lacking a
TATA box, such as the promoter for the mammalian terminal deoxynucleotidyl transferase gene
and the promoter for the SV40 late genes, a discrete element overlying the start site itself helps to
fix the place of initiation.
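
For readers who wish to check a candidate promoter region for such a positioning module, the following Python sketch searches a DNA sequence for a simplified TATA box consensus, TATA(A/T)A(A/T). The consensus pattern, the function name find_tata_boxes and the example sequence are assumptions made for illustration and are not part of this specification; real promoters tolerate deviations that a literal pattern match will miss.

```python
import re

# Simplified consensus TATA(A/T)A(A/T); an illustrative assumption only.
TATA_PATTERN = re.compile(r"TATA[AT]A[AT]")

def find_tata_boxes(dna):
    """Return 0-based start positions of non-overlapping consensus matches."""
    return [m.start() for m in TATA_PATTERN.finditer(dna.upper())]

# Hypothetical promoter fragment
fragment = "GGCGCTATAAAAGGCTCGATCGA"
print(find_tata_boxes(fragment))
```

A sequence with no match, as in the TATA-less promoters mentioned above, simply returns an empty list.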

Additional promoter elements regulate the frequency of transcriptional initiation.
Typically, these are located in the region 30-110 bp upstream of the start site, although a number
of promoters have recently been shown to contain functional elements downstream of the start site
as well. The spacing between promoter elements frequently is flexible, so that promoter function
is preserved when elements are inverted or moved relative to one another. In the tk promoter, the
spacing between promoter elements can be increased to 50 bp apart before activity begins to
decline. Depending on the promoter, it appears that individual elements can function either
cooperatively or independently to activate transcription.

The particular promoter that is employed to control the expression of a nucleic acid is not
believed to be critical, so long as it is capable of expressing the nucleic acid in the targeted cell.
Thus, where a human cell is targeted, it is preferable to position the nucleic acid coding region
adjacent to and under the control of a promoter that is capable of being expressed in a human cell.
Generally speaking, such a promoter might include either a human or viral promoter. Preferred
promoters include those derived from HSV, including the α4 promoter. Another preferred
embodiment is the tetracycline controlled promoter.

In various other embodiments, the human cytomegalovirus (CMV) immediate early gene 
promoter, the SV40 early promoter and the Rous sarcoma virus long terminal repeat can be used 
to obtain high-level expression of transgenes. The use of other viral or mammalian cellular or 
bacterial phage promoters which are well-known in the art to achieve expression of a transgene is 
contemplated as well, provided that the levels of expression are sufficient for a given purpose.
Table IV lists several promoters which may be employed, in the context of the present invention, 
to regulate the expression of a transgene. This list is not intended to be exhaustive of all the 
possible elements involved in the promotion of transgene expression but, merely, to be exemplary 
thereof.

Enhancers were originally detected as genetic elements that increased transcription from a
promoter located at a distant position on the same molecule of DNA. This ability to act over a
large distance had little precedent in classic studies of prokaryotic transcriptional regulation.
Subsequent work showed that regions of DNA with enhancer activity are organized much like
promoters. That is, they are composed of many individual elements, each of which binds to one
or more transcriptional proteins.

The basic distinction between enhancers and promoters is operational. An enhancer region
as a whole must be able to stimulate transcription at a distance; this need not be true of a promoter
region or its component elements. On the other hand, a promoter must have one or more elements
that direct initiation of RNA synthesis at a particular site and in a particular orientation, whereas
enhancers lack these specificities. Promoters and enhancers are often overlapping and contiguous,
often seeming to have a very similar modular organization. Table V lists several enhancers; of
course, this list is not meant to be limiting but exemplary.



TABLE V

Enhancer                        Inducer                     References

MT II                           Phorbol Ester (TFA),        Palmiter et al., 1982; Haslinger and
                                Heavy metals                Karin, 1985; Searle et al., 1985; Stuart
                                                            et al., 1985; Imagawa et al., 1987;
                                                            Karin et al., 1987; Angel et al., 1987b;
                                                            McNeall et al., 1989

MMTV (mouse mammary tumor       Glucocorticoids             Huang et al., 1981; Lee et al., 1981;
virus)                                                      Majors and Varmus, 1983; Chandler
                                                            et al., 1983; Lee et al., 1984; Fonta
                                                            et al., 1985; Sakai et al., 1986

β-Interferon                    poly(rI)X poly(rc)          Tavernier et al., 1983

Adenovirus 5 E2                 E1a                         Imperiale and Nevins, 1984

Collagenase                     Phorbol Ester (TPA)         Angel et al., 1987a

Stromelysin                     Phorbol Ester (TPA)         Angel et al., 1987b

SV40                            Phorbol Ester (TFA)         Angel et al., 1987b

Murine MX Gene                  Interferon, Newcastle
                                Disease Virus

GRP78 Gene                      A23187                      Resendez et al., 1988

α-2-Macroglobulin               IL-6                        Kunz et al., 1989

Vimentin                        Serum                       Rittling et al., 1989

MHC Class I Gene H-2kb          Interferon                  Blanar et al., 1989

HSP70                           E1a, SV40 Large T Antigen   Taylor et al., 1989; Taylor and
                                                            Kingston, 1990a,b

Proliferin                      Phorbol Ester-TPA           Mordacq and Linzer, 1989

Tumor Necrosis Factor           FMA                         Hensel et al., 1989

Thyroid Stimulating Hormone     Thyroid Hormone             Chatterjee et al., 1989
α Gene

Additionally any promoter/enhancer combination (as per the Eukaryotic Promoter Data 
Base EPDB) could also be used to drive expression of a transgene. Use of a T3, T7 or SP6 
cytoplasmic expression system is another possible embodiment. Eukaryotic cells can support 
cytoplasmic transcription from certain bacterial promoters if the appropriate bacterial polymerase 
is provided, either as part of the delivery complex or as an additional genetic expression construct. 

Eukaryotic microbes, such as yeast cultures, may also be used as host cells.
Saccharomyces cerevisiae, or common baker's yeast, is the most commonly used among
eukaryotic microorganisms, although a number of other strains are commonly available. For
expression in Saccharomyces, the plasmid YRp7, for example, is commonly used (Stinchcomb
et al., 1979; Kingsman et al., 1979; Tschemper et al., 1980). This plasmid already contains the
trp1 gene which provides a selection marker for a mutant strain of yeast lacking the ability to
grow in the absence of tryptophan, for example ATCC No. 44076 or PEP4-1 (Jones, 1977). The
presence of the trp1 lesion as a characteristic of the yeast host cell genome then provides an
effective environment for detecting transformation by growth in the absence of tryptophan.

Suitable promoting sequences in yeast vectors include the promoters for
3-phosphoglycerate kinase (Hitzeman et al., 1980) or other glycolytic enzymes (Hess et al., 1968;
Holland et al., 1978), such as enolase, glyceraldehyde-3-phosphate dehydrogenase, hexokinase,
pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase,
3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose
isomerase, and glucokinase. In constructing suitable expression plasmids, the termination
sequences associated with these genes are also ligated into the expression vector 3' of the
sequence desired to be expressed to provide polyadenylation of the mRNA and termination.
Other promoters, which have the additional advantage of transcription controlled by growth
conditions, are the promoter regions for alcohol dehydrogenase 2, isocytochrome C, acid
phosphatase, degradative enzymes associated with nitrogen metabolism, and the
aforementioned glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible for
maltose and galactose utilization. Any plasmid vector containing a yeast-compatible promoter,
origin of replication and termination sequences is suitable.

In addition to eukaryotic microorganisms, cultures of cells derived from multicellular organisms
may also be used as hosts. In principle, any such cell culture is workable, whether from
vertebrate or invertebrate culture. However, interest has been greatest in vertebrate cells, and
propagation of vertebrate cells in culture (tissue culture) has become a routine procedure in
recent years (Tissue Culture, 1973). Examples of such useful host cell lines are VERO and
HeLa cells, Chinese hamster ovary (CHO) cell lines, and WI38, BHK, COS-7, 293 and MDCK
cell lines. Expression vectors for such cells ordinarily include (if necessary) an origin of
replication, a promoter located in front of the gene to be expressed, along with any necessary
ribosome binding sites, RNA splice sites, polyadenylation site, and transcriptional terminator
sequences.

4.0 Preparation of Antibodies to UspA Proteins

Antibodies to UspA1 or UspA2 peptides or polypeptides may be readily prepared
through use of well-known techniques, such as those exemplified in U.S. Patent 4,196,265.
Typically, this technique involves immunizing a suitable animal with a selected immunogen
composition, e.g., purified or partially purified protein, synthetic protein or fragments thereof,
as discussed in the section on vaccines. Animals to be immunized are mammals such as cats,
dogs and horses, although there is no limitation other than that the subject be capable of
mounting an immune response of some kind. The immunizing composition is administered in a
manner effective to stimulate antibody producing cells. Rodents such as mice and rats are
preferred animals; however, the use of rabbit, sheep or frog cells is possible. The use of rats
may provide certain advantages, but mice are preferred, with the BALB/c mouse being most
preferred as the most routinely used animal and one that generally gives a higher percentage of
stable fusions.

For generation of monoclonal antibodies (MAbs), following immunization, somatic
cells with the potential for producing antibodies, specifically B lymphocytes (B cells), are
selected for use in the MAb generating protocol. These cells may be obtained from biopsied
spleens, tonsils or lymph nodes, or from a peripheral blood sample. Spleen cells and peripheral
blood cells are preferred, the former because they are a rich source of antibody-producing cells
that are in the dividing plasmablast stage, and the latter because peripheral blood is easily
accessible. Often, a panel of animals will have been immunized and the spleen of the animal
with the highest antibody titer removed. Spleen lymphocytes are obtained by homogenizing the
spleen with a syringe. Typically, a spleen from an immunized mouse contains approximately
5 x 10⁷ to 2 x 10⁸ lymphocytes.

The antibody-producing B cells from the immunized animal are then fused with cells of 
an immortal myeloma cell line, generally one of the same species as the animal that was
immunized. Myeloma cell lines suited for use in hybridoma-producing fusion procedures 
preferably are non-antibody-producing, have high fusion efficiency and enzyme deficiencies 
that render them incapable of growing in certain selective media which support the growth of 
only the desired fused cells, called "hybridomas." 



WO 98/28333 




PCT/US97/23930 



Any one of a number of myeloma cells may be used and these are known to those of
skill in the art. For example, where the immunized animal is a mouse, one may use
P3-X63/Ag8, X63-Ag8.653, NS1/1.Ag 4 1, Sp210-Ag14, FO, NSO/U, MPC-11,
MPC11-X45-GTG 1.7 and S194/5XX0 Bul; for rats, one may use R210.RCY3, Y3-Ag 1.2.3,
IR983F and 4B210; and U-266, GM1500-GRG2, LICR-LON-HMy2 and UC729-6 are all
useful in connection with human cell fusions.

One preferred murine myeloma cell line is the NS-1 myeloma cell line (also termed
P3-NS-1-Ag4-1), which is readily available from the NIGMS Human Genetic Mutant Cell
Repository by requesting cell line repository number GM3573. Another mouse myeloma cell
line that may be used is the 8-azaguanine-resistant murine myeloma SP2/0 non-producer
cell line.

Methods for generating hybrids of antibody-producing spleen or lymph node cells and
myeloma cells usually comprise mixing somatic cells with myeloma cells in a 2:1 proportion,
though the proportion may vary from about 20:1 to about 1:1, respectively, in the presence of an
agent or agents (chemical or electrical) that promote the fusion of cell membranes. Fusion
methods using Sendai virus have been described by Kohler & Milstein (1975; 1976), and those
using polyethylene glycol (PEG), such as 37% (v/v) PEG, by Gefter et al. (1977). The use of
electrically induced fusion methods is also appropriate.
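
The mixing proportions above reduce to simple arithmetic. The following Python sketch computes how many myeloma cells would be combined with a given number of somatic cells at a chosen somatic:myeloma ratio; the function name myeloma_cells_needed and the cell count used in the example are hypothetical.

```python
def myeloma_cells_needed(somatic_cells, ratio=2.0):
    """Myeloma cells to mix with `somatic_cells` at a somatic:myeloma ratio of `ratio`:1.

    The accepted range (1:1 to 20:1) mirrors the range described in the text.
    """
    if not 1.0 <= ratio <= 20.0:
        raise ValueError("ratio outside the 1:1 to 20:1 range described in the text")
    return somatic_cells / ratio

# Hypothetical harvest of 1 x 10^8 spleen lymphocytes
spleen_cells = 1e8
for r in (20.0, 2.0, 1.0):
    print(f"{r:g}:1 ->", f"{myeloma_cells_needed(spleen_cells, r):.1e}", "myeloma cells")
```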

Fusion procedures usually produce viable hybrids at low frequencies, about 1 x 10⁻⁶ to
1 x 10⁻⁸. This does not pose a problem, however, as the viable, fused hybrids are differentiated
from the parental, unfused cells (particularly the unfused myeloma cells that would normally
continue to divide indefinitely) by culture in a selective medium. The selective medium
generally is one that contains an agent that blocks the de novo synthesis of nucleotides in the
tissue culture media. Exemplary and preferred agents are aminopterin, methotrexate and
azaserine. Aminopterin and methotrexate block de novo synthesis of both purines and
pyrimidines, whereas azaserine blocks only purine synthesis. Where aminopterin or
methotrexate is used, the media is supplemented with hypoxanthine and thymidine as a source
of nucleotides (HAT medium). Where azaserine is used, the media is supplemented with
hypoxanthine.

The preferred selection medium is HAT. Only cells capable of operating nucleotide
salvage pathways are able to survive in HAT medium. The myeloma cells are defective in key
enzymes of the salvage pathway, e.g., hypoxanthine phosphoribosyl transferase (HPRT), and
they cannot survive. The B cells can operate this pathway, but they have a limited life span in
culture and generally die within about two weeks. Therefore, the only cells that can survive in
the selective media are those hybrids formed from myeloma and B cells.
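
The selection logic described above can be summarized in a few lines of Python. This is a minimal sketch: each cell type is reduced to two yes/no traits (a working salvage pathway, and the ability to divide indefinitely), and the expected hybrid yield is simply the number of cells fused multiplied by an assumed fusion frequency. The function names, the trait table and the input numbers are illustrative assumptions.

```python
def expected_hybrids(cells_fused, fusion_frequency):
    """Expected number of viable hybrids for a given per-cell fusion frequency."""
    return cells_fused * fusion_frequency

def survives_in_hat(has_salvage_pathway, divides_indefinitely):
    """A population persists in HAT only if it can salvage nucleotides AND keeps dividing."""
    return has_salvage_pathway and divides_indefinitely

# (has_salvage_pathway, divides_indefinitely) for each population after fusion
CELL_TYPES = {
    "unfused myeloma": (False, True),   # HPRT-deficient, immortal
    "unfused B cell": (True, False),    # salvage-competent, limited life span
    "hybridoma": (True, True),          # inherits both traits
}

survivors = [name for name, traits in CELL_TYPES.items() if survives_in_hat(*traits)]
print("survive HAT selection:", survivors)

# Hypothetical fusion of 1 x 10^8 cells at two assumed frequencies
for freq in (1e-6, 1e-8):
    print(f"frequency {freq:g}: about {expected_hybrids(1e8, freq):.0f} hybrids")
```

Even at the lower assumed frequency a fusion of this size yields on the order of one hybrid, which is why the selective medium, rather than the fusion efficiency, does the work of isolating hybridomas.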

This culturing provides a population of hybridomas from which specific hybridomas are 
selected. Typically, selection of hybridomas is performed by single-clone dilution in microtiter 
plates, followed by testing the individual clonal supernatants (after about two to three weeks) 
for the desired reactivity. The assay should be sensitive, simple and rapid, such as
radioimmunoassays, enzyme immunoassays, cytotoxicity assays, plaque assays, dot 
immunobinding assays, and the like. 
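
When planning a single-clone dilution, it can help to estimate how many wells will contain exactly one cell. The sketch below assumes that cells are distributed among wells according to a Poisson distribution, which is a common modeling assumption and is not stated in this specification; the function names and the seeding density of 0.3 cells per well are hypothetical.

```python
import math

def poisson_well_fractions(mean_cells_per_well):
    """Fractions of wells that are empty, hold one cell, or hold several (Poisson model)."""
    if mean_cells_per_well <= 0:
        raise ValueError("mean_cells_per_well must be positive")
    p_empty = math.exp(-mean_cells_per_well)
    p_single = mean_cells_per_well * p_empty
    return {"empty": p_empty, "single": p_single, "multiple": 1.0 - p_empty - p_single}

def clonal_fraction_of_seeded_wells(mean_cells_per_well):
    """Among wells that received at least one cell, the fraction that received exactly one."""
    f = poisson_well_fractions(mean_cells_per_well)
    return f["single"] / (1.0 - f["empty"])

fractions = poisson_well_fractions(0.3)
print({k: round(v, 3) for k, v in fractions.items()})
print("clonal fraction of seeded wells:", round(clonal_fraction_of_seeded_wells(0.3), 3))
```

Lower seeding densities raise the proportion of seeded wells that are truly clonal at the cost of more empty wells, which is why cloning is often repeated on positive wells.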

The selected hybridomas are then serially diluted and cloned into individual
antibody-producing cell lines, which clones can then be propagated indefinitely to provide
MAbs. The cell lines may be exploited for MAb production in two basic ways. A sample of
the hybridoma can be injected, usually in the peritoneal cavity, into a histocompatible animal of
the type that was used to provide the somatic and myeloma cells for the original fusion. The
injected animal develops tumors secreting the specific monoclonal antibody produced by the
fused cell hybrid. The body fluids of the animal, such as serum or ascites fluid, can then be
tapped to provide MAbs in high concentration. The individual cell lines could also be cultured
in vitro, where the MAbs are naturally secreted into the culture medium from which they can be
readily obtained in high concentrations. MAbs produced by either means may be further
purified, if desired, using filtration, centrifugation and various chromatographic methods such
as HPLC or affinity chromatography.

Monoclonal antibodies of the present invention also include anti-idiotypic antibodies
produced by methods well-known in the art. Monoclonal antibodies according to the present
invention also may be monoclonal heteroconjugates, i.e., hybrids of two or more antibody
molecules. In another embodiment, monoclonal antibodies according to the invention are
chimeric monoclonal antibodies. In one approach, the chimeric monoclonal antibody is
engineered by cloning recombinant DNA containing the promoter, leader, and variable-region
sequences from a mouse antibody producing cell and the constant-region exons from a human
antibody gene. The antibody encoded by such a recombinant gene is a mouse-human chimera.
Its antibody specificity is determined by the variable region derived from mouse sequences. Its
isotype, which is determined by the constant region, is derived from human DNA.

In another embodiment, the monoclonal antibody according to the present invention is a
"humanized" monoclonal antibody, produced by techniques well-known in the art. That is,
mouse complementarity determining regions ("CDRs") are transferred from heavy and light
V-chains of the mouse Ig into a human V-domain, followed by the replacement of some human
residues in the framework regions with their murine counterparts. "Humanized" monoclonal
antibodies in accordance with this invention are especially suitable for use in in vivo diagnostic
and therapeutic methods for treating Moraxella infections.

As stated above, the monoclonal antibodies and fragments thereof according to this
invention can be multiplied according to in vitro and in vivo methods well-known in the art.
Multiplication in vitro is carried out in suitable culture media such as Dulbecco's modified
Eagle medium or RPMI 1640 medium, optionally replenished by a mammalian serum such as
fetal calf serum or trace elements and growth-sustaining supplements, e.g., feeder cells, such as
normal mouse peritoneal exudate cells, spleen cells, bone marrow macrophages or the like. In
vitro production provides relatively pure antibody preparations and allows scale-up to give large
amounts of the desired antibodies. Techniques for large scale hybridoma cultivation under
tissue culture conditions are known in the art and include homogeneous suspension culture, e.g.,
in an airlift reactor or in a continuous stirrer reactor or immobilized or entrapped cell culture.

Large amounts of the monoclonal antibody of the present invention also may be
obtained by multiplying hybridoma cells in vivo. Cell clones are injected into mammals which
are histocompatible with the parent cells, e.g., syngeneic mice, to cause growth of
antibody-producing tumors. Optionally, the animals are primed with a hydrocarbon, especially
oils such as Pristane (tetramethylpentadecane) prior to injection.

In accordance with the present invention, fragments of the monoclonal antibody of the
invention can be obtained from monoclonal antibodies produced as described above, by
methods which include digestion with enzymes such as pepsin or papain and/or cleavage of
disulfide bonds by chemical reduction. Alternatively, monoclonal antibody fragments
encompassed by the present invention can be synthesized using an automated peptide
synthesizer, or they may be produced manually using techniques well known in the art.

The monoclonal conjugates of the present invention are prepared by methods known in
the art, e.g., by reacting a monoclonal antibody prepared as described above with, for instance,
an enzyme in the presence of a coupling agent such as glutaraldehyde or periodate. Conjugates
with fluorescein markers are prepared in the presence of these coupling agents, or by reaction
with an isothiocyanate. Conjugates with metal chelates are similarly produced. Other moieties
to which antibodies may be conjugated include radionuclides such as ³H, ¹²⁵I, ¹³¹I, ³²P, ³⁵S, ¹⁴C,
⁵¹Cr, ³⁶Cl, ⁵⁷Co, ⁵⁸Co, ⁵⁹Fe, ⁷⁵Se, ¹⁵²Eu, and ⁹⁹ᵐTc. Radio-labeled monoclonal antibodies of the
present invention are produced according to well-known methods in the art. For instance,
monoclonal antibodies can be iodinated by contact with sodium or potassium iodide and a
chemical oxidizing agent such as sodium hypochlorite, or an enzymatic oxidizing agent, such as
lactoperoxidase. Monoclonal antibodies according to the invention may be labeled with
technetium-99m by a ligand exchange process, for example, by reducing pertechnetate with
stannous solution, chelating the reduced technetium onto a Sephadex column and applying the
antibody to this column, or by direct labeling techniques, e.g., by incubating pertechnetate, a
reducing agent such as SnCl₂, a buffer solution such as sodium-potassium phthalate solution,
and the antibody.

5.0 Use of Peptides and Monoclonal Antibodies in Immunoassays

It is proposed that the monoclonal antibodies of the present invention will find useful
application in standard immunochemical procedures, such as ELISA and western blot methods,
as well as other procedures which may utilize antibodies specific to CopB epitopes. While
ELISAs are preferred, it will be readily appreciated that such assays include RIAs and other
non-enzyme linked antibody binding assays or procedures. Additionally, it is proposed that
monoclonal antibodies specific to the particular UspA epitope may be utilized in other useful
applications. For example, their use in immunoadsorbent protocols may be useful in purifying
native or recombinant UspA proteins or variants thereof.

It also is proposed that the disclosed UspA1 and UspA2 peptides of the invention will
find use as antigens for raising antibodies and in immunoassays for the detection of anti-UspA
antigen-reactive antibodies. In a variation on this embodiment, UspA1 and UspA2 mutant
peptides may be screened, in immunoassay format, for reactivity against UspA1- or
UspA2-specific antibodies, such as MAb 17C7. In this way, a mutational analysis of various
epitopes may be performed. Results from such analyses may then be used to determine which
additional UspA1 or UspA2 epitopes may be recognized by antibodies and useful in the
preparation of potential vaccines for Moraxella.

Diagnostic immunoassays include direct culturing of bodily fluids, either in liquid 
culture or on a solid support such as nutrient agar. A typical assay involves collecting a sample 
of bodily fluid from a patient and placing the sample in conditions optimum for growth of the 
pathogen. The determination can then be made as to whether the microbe exists in the sample. 
Further analysis can be carried out to determine the hemolyzing properties of the microbe. 

Immunoassays encompassed by the present invention include, but are not limited to 
those described in U.S. Patent No. 4,367,110 (double monoclonal antibody sandwich assay) and
U.S. Patent No. 4,452,901 (western blot). Other assays include immunoprecipitation of labeled 
ligands and immunocytochemistry, both in vitro and in vivo. 

Immunoassays, in their most simple and direct sense, are binding assays. Certain 
preferred immunoassays are the various types of enzyme linked immunosorbent assays 
(ELISAs) and radioimmunoassays (RIAs) known in the art. Immunohistochemical detection 
using tissue sections is also particularly useful. However, it will be readily appreciated that 
detection is not limited to such techniques, and western blotting, dot blotting, FACS analyses, 
and the like may also be used. 

In one exemplary ELISA, the anti-UspA antibodies of the invention are immobilized 
onto a selected surface exhibiting protein affinity, such as a well in a polystyrene microtiter 
plate. Then, a test composition suspected of containing the desired antigen, such as a clinical 
sample, is added to the wells. After binding and washing to remove non-specifically bound 
immune complexes, the bound antigen may be detected. Detection is generally achieved by the 
addition of another antibody, specific for the desired antigen, that is linked to a detectable label. 
This type of ELISA is a simple "sandwich ELISA". Detection may also be achieved by the 
addition of a second antibody specific for the desired antigen, followed by the addition of a
third antibody that has binding affinity for the second antibody, with the third antibody being
linked to a detectable label.

In another exemplary ELISA, the samples suspected of containing the UspA antigen are
immobilized onto the well surface and then contacted with the anti-UspA antibodies. After 
binding and appropriate washing, the bound immune complexes are detected. Where the initial 
antigen specific antibodies are linked to a detectable label, the immune complexes may be 
detected directly. Again, the immune complexes may be detected using a second antibody that 
has binding affinity for the first antigen specific antibody, with the second antibody being 
linked to a detectable label. 

Further methods include the detection of primary immune complexes by a two step 
approach. A second binding ligand, such as an antibody, that has binding affinity for the 
primary antibody is used to form secondary immune complexes, as described above. After 
washing, the secondary immune complexes are contacted with a third binding ligand or 
antibody that has binding affinity for the second antibody, again under conditions effective and 
for a period of time sufficient to allow the formation of immune complexes (tertiary immune 
complexes). The third ligand or antibody is linked to a detectable label, allowing detection of 
the tertiary immune complexes thus formed. This system may provide for signal amplification 
if desired. 

Competition ELISAs are also possible in which test samples compete for binding with 
known amounts of labeled antigens or antibodies. The amount of reactive species in the 
unknown sample is determined by mixing the sample with the known labeled species before or 
during incubation with coated wells. (Antigen or antibodies may also be linked to a solid 
support, such as in the form of beads, dipstick, membrane or column matrix, and the sample to 
be analyzed applied to the immobilized antigen or antibody.) The presence of reactive species 
in the sample acts to reduce the amount of labeled species available for binding to the well and 
thus reduces the ultimate signal. 
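
As an illustrative sketch only, the signal reduction seen in a competition ELISA can be expressed as a percent inhibition relative to a well that received no competing sample. The function name, the background correction and the absorbance readings below are assumptions made for illustration and are not taken from the described assay.

```python
def percent_inhibition(sample_signal, uninhibited_signal, background_signal=0.0):
    """Percent reduction in labeled-species signal caused by a test sample.

    uninhibited_signal: reading from a well with labeled species only.
    background_signal: reading from a blank well (assumed control).
    """
    window = uninhibited_signal - background_signal
    if window <= 0:
        raise ValueError("uninhibited signal must exceed background")
    bound_fraction = (sample_signal - background_signal) / window
    return 100.0 * (1.0 - bound_fraction)


# Hypothetical absorbance readings, for illustration only.
readings = {"sample A": 0.35, "sample B": 1.10}
for name, signal in readings.items():
    pct = percent_inhibition(signal, uninhibited_signal=1.25, background_signal=0.05)
    print(f"{name}: {pct:.1f}% inhibition")
```

A higher percent inhibition corresponds to more reactive species in the unknown sample, since less labeled species remains available to bind the well.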

Irrespective of the format employed, ELISAs have certain features in common, such as 
coating, incubating or binding, washing to remove non-specifically bound species, and 
detecting the bound immune complexes. These are described below. 

In coating a plate with either antigen or antibody, one will generally incubate the wells 
of the plate with a solution of the antigen or antibody, either overnight or for a specified period. 
The wells of the plate will then be washed to remove incompletely adsorbed material. Any 
remaining available surfaces of the wells are then "coated" with a nonspecific protein that is 
antigenically neutral with regard to the test antisera. These include bovine serum albumin 
(BSA), casein and solutions of milk powder. The coating allows for blocking of nonspecific 
adsorption sites on the immobilizing surface and thus reduces the background caused by 
nonspecific binding of antisera onto the surface. 

After binding of antigenic material to the well, coating with a non-reactive material to 
reduce background, and washing to remove unbound material, the immobilizing surface is 
contacted with the antisera or clinical or biological extract to be tested in a manner conducive to 
immune complex (antigen/antibody) formation. Such conditions preferably include diluting the 
antisera with diluents such as BSA, bovine gamma globulin (BGG) and phosphate buffered 
saline (PBS)/Tween. These added agents also tend to assist in the reduction of nonspecific 
background. The layered antisera are then allowed to incubate for 2 to 4 hours, at 
temperatures preferably on the order of 25° to 27°C. Following incubation, the antisera- 
contacted surface is washed so as to remove non-immunocomplexed material. A preferred 
washing procedure includes washing with a solution such as PBS/Tween, or borate buffer. 

Following formation of specific immunocomplexes between the test sample and the 
bound antigen, and subsequent washing, the occurrence and even amount of immunocomplex 
formation may be determined by subjecting same to a second antibody having specificity for the 
first. Of course, in that the test sample will typically be of human origin, the second antibody 
will preferably be an antibody having specificity in general for human IgG. To provide a 
detecting means, the second antibody will preferably have an associated enzyme that will 
generate a color development upon incubating with an appropriate chromogenic substrate. 
Thus, for example, one will desire to contact and incubate the antisera-bound surface with a 
urease or peroxidase-conjugated anti-human IgG for a period of time and under conditions 
which favor the development of immunocomplex formation (e.g., incubation for 2 hours at 
room temperature in a PBS-containing solution such as PBS-Tween). 

After incubation with the second enzyme-tagged antibody, and subsequent to washing to 
remove unbound material, the amount of label is quantified by incubation with a chromogenic 
substrate such as urea and bromocresol purple or 2,2'-azino-di-(3-ethyl-benzthiazoline-6-sulfonic 
acid) [ABTS] and H2O2, in the case of peroxidase as the enzyme label. Quantification 
is then achieved by measuring the degree of color generation, e.g., using a visible spectrum 
spectrophotometer. Alternatively, the label may be a chemiluminescent one. The use of such 
labels is described in U.S. Patent Nos. 5,310,687, 5,238,808 and 5,221,605. 
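
One common way to turn a measured color intensity into an amount is to compare it with wells containing known standards. The sketch below assumes a set of hypothetical standards whose absorbance rises monotonically with concentration and uses simple linear interpolation between neighboring standards; the function name, units and values are illustrative assumptions, not part of the described method.

```python
from bisect import bisect_left


def interpolate_concentration(absorbance, standards):
    """Estimate concentration from absorbance by linear interpolation.

    standards: iterable of (concentration, absorbance) pairs, assumed to be
    monotonically increasing in absorbance with concentration.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    abs_values = [a for _, a in points]
    if not abs_values[0] <= absorbance <= abs_values[-1]:
        raise ValueError("absorbance outside the range of the standards")
    i = bisect_left(abs_values, absorbance)
    if abs_values[i] == absorbance:
        return points[i][0]
    (c0, a0), (c1, a1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)


# Hypothetical standards: (concentration in arbitrary units, absorbance).
standards = [(0.0, 0.05), (10.0, 0.25), (50.0, 1.05)]
print(interpolate_concentration(0.65, standards))
```

Readings outside the range of the standards are rejected rather than extrapolated, which is a deliberate design choice in this sketch.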

6.0 Prophylactic Use of UspA Peptides and UspA-Specific Antibodies 

In a further embodiment of the present invention, there are provided methods for active 
and passive immunoprophylaxis. Active immunoprophylaxis will be discussed first, followed 
by a discussion on passive immunoprophylaxis. It should be noted that the discussion of 
formulating vaccine compositions in the context of active immunotherapy is relevant to the 
raising of antibodies in experimental animals for passive immunotherapy and for the generation of 
diagnostic methods. 

6.1 Active Immunotherapy 

According to the present invention, UspA1 or UspA2 polypeptides or UspA1- or 
UspA2-derived peptides, as discussed above, may be used as vaccine formulations to generate 
protective anti-M. catarrhalis antibody responses in vivo. By protective, it is only meant that 
the immune system of a treated individual is capable of generating a response that reduces, to 
any extent, the clinical impact of the bacterial infection. This may range from a minimal 
decrease in bacterial burden to outright prevention of infection. Ideally, the treated subject will 
not exhibit the more serious clinical manifestations of M. catarrhalis infection. 

Generally, immunoprophylaxis involves the administration, to a subject at risk, of a 
vaccine composition. In the instant case, the vaccine composition will contain a UspA1 and/or 
UspA2 polypeptide or immunogenic derivative thereof in a pharmaceutically acceptable carrier, 
diluent or excipient. As stated above, those of skill in the art are able, through a variety of 
mechanisms, to identify appropriate antigenic characteristics of UspA1 and UspA2 and, in so 
doing, develop vaccines that will achieve generation of immune responses against M. 
catarrhalis. 

The stability and immunogenicity of UspA1 and UspA2 antigens may vary and, 
therefore, it may be desirable to couple the antigen to a carrier molecule. Exemplary carriers 
are KLH, BSA, human serum albumin, myoglobin, β-galactosidase, penicillinase, CRM197 and 
bacterial toxoids, such as diphtheria toxoid and tetanus toxoid. Those of skill in the art are 
aware of proper methods by which peptides can be linked to carriers without destroying their 
immunogenic value. Synthetic carriers such as multi-poly-DL-alanyl-poly-L-lysine and poly-L- 
lysine also are contemplated. Coupling generally is accomplished through amino or carboxyl- 
terminal residues of the antigen, thereby affording the peptide or polypeptide the greatest 
chance of assuming a relatively "native" conformation following coupling. 

It is recognized that other protective agents could be coupled with either a UspA1 or 
UspA2 antigen such that the UspA1 or UspA2 antigen acts as the carrier molecule. For 
example, agents which protect against other pathogenic organisms, such as bacteria, viruses or 
parasites, could be coupled to either a UspA1 or UspA2 antigen to produce a multivalent 
vaccine or pharmaceutical composition which would be useful for the treatment or inhibition of 
both M. catarrhalis infection and other pathogenic infections. In particular, it is envisioned that 
either UspA1 or UspA2 proteins or peptides could serve as immunogenic carriers for other 
vaccine components, for example, saccharides of pneumococcus, meningococcus or Haemophilus 
influenzae, and could even be covalently coupled to these other components. 

It also may be desirable to include in the composition any of a number of different 
substances referred to as adjuvants, which are known to stimulate the appropriate portion of the 
immune system of the vaccinated animal. Suitable adjuvants for the vaccination of subjects 
(including experimental animals) include, but are not limited to oil emulsions such as Freund's 
complete or incomplete adjuvant (not suitable for livestock use), Marcol 52:Montanide 888 
(Marcol is a Trademark of Esso, Montanide is a Trademark of SEPPIC, Paris), squalane or 
squalene, Adjuvant 65 (containing peanut oil, mannide monooleate and aluminum 
monostearate), MPL™ (3-O-deacylated monophosphoryl lipid A; RIBI ImmunoChem Research 
Inc., Hamilton, Utah), Stimulon™ (QS-21; Aquila Biopharmaceuticals Inc., Wooster, MA), 
mineral gels such as aluminum hydroxide, aluminum phosphate, calcium phosphate and alum, 
surfactants such as hexadecylamine, octadecylamine, lysolecithin, 
dimethyldioctadecylammonium bromide, N,N-dioctadecyl-N,N'-bis(2-hydroxyethyl)-propanediamine, 
methoxyhexadecylglycerol and pluronic polyols, polyanions such as pyran, dextran sulfate, 
polyacrylic acid and carbopol, peptides and amino acids such as muramyl dipeptide, 
dimethylglycine, tuftsin and trehalose dimycolate. Agents such as synthetic polymers of sugars 
(Carbopol), emulsion in physiologically acceptable oil vehicles such as mannide mono-oleate 
(Arlacel A) or emulsion with a 20 percent solution of a perfluorocarbon (Fluosol-DA) also may be 
employed. 

The preparation of vaccines which contain peptide sequences as active ingredients is 
generally well understood in the art, as exemplified by U.S. Patents 4,608,251; 4,601,903; 
4,599,231; 4,599,230; 4,596,792; and 4,578,770, all incorporated herein by reference. 
Typically, such vaccines are prepared as injectables, either as liquid solutions or suspensions; 
solid forms suitable for solution in, or suspension in, liquid prior to injection may also be 
prepared. The preparation may also be emulsified. The active immunogenic ingredient is often 
mixed with excipients which are pharmaceutically acceptable and compatible with the active 
ingredient. Suitable excipients are, for example, water, saline, dextrose, glycerol, ethanol, or 
the like and combinations thereof. In addition, if desired, the vaccine may contain minor 
amounts of auxiliary substances such as wetting or emulsifying agents, pH buffering agents, or 
adjuvants which enhance the effectiveness of the vaccines. 

The vaccine preparations of the present invention also can be administered following 
incorporation into non-toxic carriers such as liposomes or other microcarrier substances, or after 
conjugation to polysaccharides, proteins or polymers or in combination with Quil-A to form 
"iscoms" (immunostimulating complexes). These complexes can serve to reduce the toxicity of 
the antigen, delay its clearance from the host and improve the immune response by acting as an 
adjuvant. Other suitable adjuvants for use in this embodiment of the present invention include 
INF, IL-2, IL-4, IL-8, IL-12 and other immunostimulatory compounds. Further, conjugates 
comprising the immunogen together with an integral membrane protein of prokaryotic origin, 
such as TraT (see PCT/AU87/00107) may prove advantageous. 

The vaccines are conventionally administered parenterally, by injection, for example, 
either subcutaneously or intramuscularly. Additional formulations which are suitable for other 
modes of administration include suppositories and, in some cases, oral formulations. For 
suppositories, traditional binders and carriers may include, for example, polyalkylene glycols or 
triglycerides; such suppositories may be formed from mixtures containing the active ingredient 
in the range of 0.5% to 10%, preferably 1-2%. Oral formulations include such normally 
employed excipients as, for example, pharmaceutical grades of mannitol, lactose, starch, 
magnesium stearate, sodium saccharine, cellulose, magnesium carbonate and the like. These 
compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release 
formulations or powders and contain 10-95% of active ingredient, preferably 25-70%. 

The peptides may be formulated into the vaccine as neutral or salt forms. 
Pharmaceutically acceptable salts include the acid addition salts (formed with the free amino 
groups of the peptide), which are formed with inorganic acids such as, for example, 
hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, tartaric, mandelic, and 
the like. Salts formed with the free carboxyl groups may also be derived from inorganic bases 
such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such 
organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine, procaine, and 
the like. 

The vaccines are administered in a manner compatible with the dosage formulation, and 
in such amount as will be therapeutically effective and immunogenic. The quantity to be 
administered depends on the subject to be treated, including, e.g., the capacity of the 
individual's immune system to synthesize antibodies, and the degree of protection desired. 
Precise amounts of active ingredient required to be administered depend on the judgment of the 
practitioner. However, suitable dosage ranges are of the order of several hundred micrograms 
active ingredient per vaccination. Suitable regimes for initial administration and booster shots 
are also variable, but are typified by an initial administration followed by subsequent 
inoculations or other administrations. 

The manner of application may be varied widely. Any of the conventional methods for 
administration of a vaccine are applicable. These are believed to include oral application on a 
solid physiologically acceptable base or in a physiologically acceptable dispersion, parenterally, 
by injection or the like. The dosage of the vaccine will depend on the route of administration 
and will vary according to the size of the host. 

In many instances, it will be desirable to have multiple administrations of the vaccine, 
usually not exceeding six vaccinations, more usually not exceeding four vaccinations and 
preferably one or more, usually at least about three vaccinations. The vaccinations will 
normally be at from two to twelve week intervals, more usually from three to five week 
intervals. Periodic boosters at intervals of 1-5 years, usually three years, will be desirable to 
maintain protective levels of the antibodies. The course of the immunization may be followed 
by assays for antibodies for the supernatant antigens. The assays may be performed by labeling 
with conventional labels, such as radionuclides, enzymes, fluorescers, and the like. These 
techniques are well known and may be found in a wide variety of patents, such as U.S. Patent 
Nos. 3,791,932; 4,174,384 and 3,949,064, as illustrative of these types of assays. 

6.2 Passive Immunotherapy 

Passive immunity is defined, for the purposes of this application, as the transfer to an 
organism of an immune response effector that was generated in another organism. The classic 
example of establishing passive immunity is to transfer antibodies produced in one organism 
into a second, immunologically compatible animal. By "immunologically compatible," it is 
meant that the antibody can perform at least some of its immune functions in the new host 
animal. More recently, as a better understanding of cellular immune functions has evolved, it 
has become possible to accomplish passive immunity by transferring other effectors, such as 
certain kinds of lymphocytes, including cytotoxic and helper T cells, NK cells and other 
immune effector cells. The present invention contemplates both of these approaches. 

Antibodies, antisera and immune effector cells are raised using standard vaccination 
regimes in appropriate animals, as discussed above. The primary animal is vaccinated with at 
least a microbe preparation or one bacterial product or by-product according to the present 
invention, with or without an adjuvant, to generate an immune response. The immune response 
may be monitored, for example, by measurement of the levels of antibodies produced, using 
standard ELISA methods. 

Once an adequate immune response has been generated, immune effector cells can be 
collected on a regular basis, usually from blood draws. The antibody fraction can be purified 
from the blood by standard means, e.g., by protein A or protein G chromatography. In an 
alternative preferred embodiment, monoclonal antibody-producing hybridomas are prepared by 
standard means (Coligan et al., 1991). Monoclonal antibodies are then prepared from the 
hybridoma cells by standard means. If the primary host's monoclonal antibodies are not 
compatible with the animal to be treated, it is possible that genetic engineering of the cells can 
be employed to modify the antibody to be tolerated by the animal to be treated. In the human 
context, murine antibodies, for example, may be "humanized" in this fashion. 

Antibodies, antisera or immune effector cells, prepared as set forth above, are injected 
into hosts to provide passive immunity against microbial infestation. For example, an antibody 
composition is prepared by mixing, preferably homogeneously mixing, at least one antibody 
with at least one pharmaceutically or veterinarily acceptable carrier, diluent, or excipient using 
standard methods of pharmaceutical or veterinary preparation. The amount of antibody 
required to produce a single dosage form will vary depending upon the microbial species being 
vaccinated against, the individual to be treated and the particular mode of administration. The 
specific dose level for any particular individual will depend upon a variety of factors including 
the age, body weight, general health, sex, and diet of the individual, time of administration, 
route of administration, rate of excretion, drug combination and the severity of the microbial 
infestation. 

The antibody composition may be administered intravenously, subcutaneously, 
intranasally, orally, intramuscularly, vaginally, rectally, topically or via any other desired route. 
Repeated dosings may be necessary and will vary, for example, depending on the clinical 
setting, the particular microbe, the condition of the patient and the use of other therapies. 

6.3 DNA Immunization 

The invention also relates to a vaccine comprising a nucleic acid molecule encoding a 
UspA1, UspA2 protein or a peptide comprising SEQ ID NO: 17 wherein said UspA1, UspA2 
protein or peptide retains immunogenicity and, when incorporated into an immunogenic 
composition or vaccine and administered to a vertebrate, provides protection without inducing 
enhanced disease upon subsequent infection of the vertebrate with M. catarrhalis, and a 
physiologically acceptable vehicle. Such a vaccine is referred to herein as a nucleic acid 
vaccine or DNA vaccine and is useful for the genetic immunization of vertebrates. 

The term, "genetic immunization", as used herein, refers to inoculation of a vertebrate, 
particularly a mammal such as a mouse or human, with a nucleic acid vaccine directed against a 
pathogenic agent, particularly M. catarrhalis, resulting in protection of the vertebrate against M. 
catarrhalis. A "nucleic acid vaccine" or "DNA vaccine" as used herein, is a nucleic acid 
construct comprising a nucleic acid molecule encoding UspA1, UspA2 or an immunogenic 
epitope comprising SEQ ID NO: 17. The nucleic acid construct can also include transcriptional 
promoter elements, enhancer elements, splicing signals, termination and polyadenylation 
signals, and other nucleic acid sequences. 

The nucleic acid vaccine can be produced by standard methods. For example, using 
known methods, a nucleic acid (e.g., DNA) encoding UspA1 or UspA2 can be inserted into an 
expression vector to construct a nucleic acid vaccine (see Maniatis et al., 1989). The individual 
vertebrate is inoculated with the nucleic acid vaccine (i.e., the nucleic acid vaccine is 
administered), using standard methods. The vertebrate can be inoculated subcutaneously, 
intravenously, intraperitoneally, intradermally, intramuscularly, topically, orally, rectally, 
nasally, buccally, vaginally, by inhalation spray, or via an implanted reservoir in dosage 
formulations containing conventional non-toxic, physiologically acceptable carriers or vehicles. 
Alternatively, the vertebrate is inoculated with the nucleic acid vaccine through the use of a 
particle acceleration instrument (a "gene gun"). The form in which it is administered (e.g., 
capsule, tablet, solution, emulsion) will depend in part on the route by which it is administered. 
For example, for mucosal administration, nose drops, inhalants or suppositories can be used. 

The nucleic acid vaccine can be administered in conjunction with any suitable adjuvant. 
The adjuvant is administered in a sufficient amount, which is that amount that is sufficient to 
generate an enhanced immune response to the nucleic acid vaccine. The adjuvant can be 
administered prior to (e.g., 1 or more days before) inoculation with the nucleic acid vaccine; 
concurrently with (e.g., within 24 hours of) inoculation with the nucleic acid vaccine; 
contemporaneously (simultaneously) with the nucleic acid vaccine (e.g., the adjuvant is mixed 
with the nucleic acid vaccine, and the mixture is administered to the vertebrate); or after (e.g., 1 
or more days after) inoculation with the nucleic acid vaccine. The adjuvant can also be 
administered at more than one time (e.g., prior to inoculation with the nucleic acid vaccine and 
also after inoculation with the nucleic acid vaccine). As used herein, the term "in conjunction 
with" encompasses any time period, including those specifically described herein and 
combinations of the time periods specifically described herein, during which the adjuvant can 
be administered so as to generate an enhanced immune response to the nucleic acid vaccine 
(e.g., an increased antibody titer to the antigen encoded by the nucleic acid vaccine, or an 
increased antibody titer to M. catarrhalis). The adjuvant and the nucleic acid vaccine can be 
administered at approximately the same location on the vertebrate; for example, both the 
adjuvant and the nucleic acid vaccine are administered at a marked site on a limb of the 
vertebrate. 

In a particular embodiment, the nucleic acid construct is co-administered with a 
transfection-facilitating agent. In a preferred embodiment, the transfection-facilitating agent is 
dioctylglycylspermine (DOGS) (as exemplified in published PCT application publication no. 
WO 96/21356 and incorporated herein by reference). In another embodiment, the transfection- 
facilitating agent is bupivacaine (as exemplified in U.S. Patent 5,593,972 and incorporated 
herein by reference). 

6.4 Animal Model for Testing Efficacy of Therapies 

The evaluation of the functional significance of antibodies to surface antigens of M. 
catarrhalis has been hampered by the lack of a suitable animal model. The relative lack of 
virulence of this organism for animals rendered identification of an appropriate model system 
difficult (Doern, 1986). Attempts to use rodents, including chinchillas, to study middle ear 
infections caused by M. catarrhalis were unsuccessful, likely because this organism cannot 
grow or survive in the middle ear of these hosts (Doyle, 1989). 

Murine short-term pulmonary clearance models have now been developed (Unhanand et 
al., 1992; Verghese et al., 1990) which permit an evaluation of the interaction of M. catarrhalis 
with the lower respiratory tract as well as assessment of pathologic changes in the lungs. This 
model reproducibly delivers an inoculum of bacteria to a localized peripheral segment of the 
murine lung. Bacteria multiply within the lung, but are eventually cleared as a result of (i) 
resident defense mechanisms, (ii) the development of an inflammatory response, and/or (iii) the 
development of a specific immune response. Using this model, it has been demonstrated that 
serum IgG antibody can enter the alveolar spaces in the absence of an inflammatory response 
and enhance pulmonary clearance of nontypable H. influenzae (McGehee et al., 1989), a 
pathogen with a host range and disease spectrum nearly identical to those of M. catarrhalis. 

7.0 Screening Assays 

In still further embodiments, the present invention provides methods for identifying new 
M. catarrhalis inhibitory compounds, which may be termed "candidate substances," by 
screening for immunogenic activity with peptides that include one or more mutations to the 
identified immunogenic epitopic region. It is contemplated that such screening techniques will 
prove useful in the general identification of any compound that will serve the purpose of 
inhibiting, or even killing, M. catarrhalis, and in preferred embodiments, will provide candidate 
vaccine compounds. 



It is further contemplated that useful compounds in this regard will in no way be limited 
to proteinaceous or peptidyl compounds. In fact, it may prove to be the case that the most 
useful pharmacological compounds for identification through application of the screening 
assays will be non-peptidyl in nature and will, e.g., serve to inhibit bacterial protein 
transcription through a tight binding or other chemical interaction. Candidate substances may 
be obtained from libraries of synthetic chemicals, or from natural samples, such as rain forest 
and marine samples. 



To identify an M. catarrhalis inhibitor, one would simply conduct parallel or otherwise 
comparatively controlled immunoassays and identify a compound that inhibits the phenotype of 
M. catarrhalis. Those of skill in the art are familiar with the use of immunoassays for 
competitive screenings (for example, refer to Sambrook et al., 1989). 

Once a candidate substance is identified, one would measure the ability of the candidate 
substance to inhibit M. catarrhalis. In general, one 
will desire to measure or otherwise determine the activity of M. catarrhalis in the absence of 
the added candidate substance relative to the activity in the presence of the candidate substance 
in order to assess the relative inhibitory capability of the candidate substance. 

7.1 Mutagenesis 

Site-specific mutagenesis is a technique useful in the preparation of individual peptides, 
or biologically functional equivalent proteins or peptides, through specific mutagenesis of the 
underlying DNA. The technique further provides a ready ability to prepare and test sequence 
variants, incorporating one or more of the foregoing considerations, by introducing one or more 
nucleotide sequence changes into the DNA. Site-specific mutagenesis allows the production of 
mutants through the use of specific oligonucleotide sequences which encode the DNA sequence 
of the desired mutation, as well as a sufficient number of adjacent nucleotides, to provide a 
primer sequence of sufficient size and sequence complexity to form a stable duplex on both 
sides of the deletion junction being traversed. Typically, a primer of about 17 to 25 nucleotides 
in length is preferred, with about 5 to 10 residues on both sides of the junction of the sequence 
being altered. 
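
The primer geometry described above (a mutated core flanked on each side by unaltered template sequence) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the template is a made-up sequence, not a uspA1 or uspA2 sequence, the function names are hypothetical, and no melting temperature or secondary structure checks are attempted.

```python
def mutagenic_primer(template, start, end, replacement, flank=10):
    """Build a primer that replaces template[start:end] with `replacement`,
    keeping `flank` unaltered template nucleotides on each side."""
    template = template.upper()
    if not 0 <= start <= end <= len(template):
        raise ValueError("start/end must lie within the template")
    if start < flank or end + flank > len(template):
        raise ValueError("not enough flanking sequence on one side")
    left = template[start - flank:start]
    right = template[end:end + flank]
    return left + replacement.upper() + right


def in_preferred_length_range(primer, low=17, high=25):
    """Check the 'about 17 to 25 nucleotides' guideline from the text."""
    return low <= len(primer) <= high


# Hypothetical template, for illustration only.
TEMPLATE = "ATGGCTAGCAAGGTCCTGACCGATCTGGAAACC"

# Substitute the C at index 15 with A, keeping 8 nucleotides on each side.
primer = mutagenic_primer(TEMPLATE, 15, 16, "A", flank=8)
print(primer, len(primer), in_preferred_length_range(primer))
```

Passing an empty `replacement` models a deletion, in which case the two flanks meet directly at the deletion junction.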

In general, the technique of site-specific mutagenesis is well known in the art. As will be 
appreciated, the technique typically employs a bacteriophage vector that exists in both a single 
stranded and double stranded form. Typical vectors useful in site-directed mutagenesis include 
vectors such as the M13 phage. These phage vectors are commercially available and their use is 
generally well known to those skilled in the art. Double stranded plasmids are also routinely 
employed in site directed mutagenesis, which eliminates the step of transferring the gene of 
interest from a phage to a plasmid. 

In general, site-directed mutagenesis is performed by first obtaining a single-stranded 
vector, or melting of two strands of a double stranded vector which includes within its sequence 
a DNA sequence encoding the desired protein. An oligonucleotide primer bearing the desired 
mutated sequence is synthetically prepared. This primer is then annealed with the single- 
stranded DNA preparation, and subjected to DNA polymerizing enzymes such as E. coli 
polymerase I Klenow fragment, in order to complete the synthesis of the mutation-bearing 
strand. Thus, a heteroduplex is formed wherein one strand encodes the original non-mutated 
sequence and the second strand bears the desired mutation. This heteroduplex vector is then 
used to transform appropriate cells, such as E. coli cells, and clones are selected that include 
recombinant vectors bearing the mutated sequence arrangement. 

The preparation of sequence variants of the selected gene using site-directed 
mutagenesis is provided as a means of producing potentially useful species and is not meant to 
be limiting, as there are other ways in which sequence variants of genes may be obtained. For 
example, recombinant vectors encoding the desired gene may be treated with mutagenic agents, 
such as hydroxylamine, to obtain sequence variants. 

7.2 Second Generation Inhibitors 

In addition to the inhibitory compounds initially identified, the inventor also 
contemplates that other sterically similar compounds may be formulated to mimic the key 
portions of the structure of the inhibitors. Such compounds, which may include 
peptidomimetics of peptide inhibitors, may be used in the same manner as the initial inhibitors. 

Certain mimetics that mimic elements of protein secondary structure are designed using 
the rationale that the peptide backbone of proteins exists chiefly to orientate amino acid side 
chains in such a way as to facilitate molecular interactions. A peptide mimetic is thus designed 
to permit molecular interactions similar to the natural molecule. 

Some successful applications of the peptide mimetic concept have focused on mimetics 
of β-turns within proteins, which are known to be highly antigenic. Likely β-turn structure 
within a polypeptide can be predicted by computer-based algorithms, as discussed herein. Once 
the component amino acids of the turn are determined, mimetics can be constructed to achieve a 
similar spatial orientation of the essential elements of the amino acid side chains. 

The generation of further structural equivalents or mimetics may be achieved by the 
techniques of modeling and chemical design known to those of skill in the art. The art of 
computer-based chemical modeling is now well known. Using such methods, a chemical that 
specifically inhibits viral transcription elongation can be designed, and then synthesized, 
following the initial identification of a compound that inhibits RNA elongation, but that is not 
specific or sufficiently specific to inhibit viral RNA elongation in preference to human RNA 
elongation. It will be understood that all such sterically similar constructs and second 
generation molecules fall within the scope of the present invention. 

8.0 Diagnosing M. catarrhalis Infections 

8.1 Amplification and PCR™ 

Nucleic acid sequence used as a template for amplification is isolated from cells 
contained in the biological sample, according to standard methodologies (Sambrook et al., 
1989). The nucleic acid may be genomic DNA or fractionated or whole cell RNA. Where 
RNA is used, it may be desired to convert the RNA to a cDNA. 

Pairs of primers that selectively hybridize to nucleic acids corresponding to UspA1 or 
UspA2 protein or a mutant thereof are contacted with the isolated nucleic acid under conditions 
that permit selective hybridization. The term "primer", as defined herein, is meant to encompass 
any nucleic acid that is capable of priming the synthesis of a nascent nucleic acid in a template- 
dependent process. Typically, primers are oligonucleotides from ten to twenty base pairs in 
length, but longer sequences can be employed. Primers may be provided in double-stranded or 
single-stranded form, although the single-stranded form is preferred. 

Once hybridized, the nucleic acid:primer complex is contacted with one or more 
enzymes that facilitate template-dependent nucleic acid synthesis. Multiple rounds of 
amplification, also referred to as "cycles," are conducted until a sufficient amount of 
amplification product is produced. 

Next, the amplification product is detected. In certain applications, the detection may be 
performed by visual means. Alternatively, the detection may involve indirect identification of 
the product via chemiluminescence, radioactive scintigraphy of incorporated radiolabel or 
fluorescent label or even via a system using electrical or thermal impulse signals (Affymax 
technology). 

A number of template dependent processes are available to amplify the marker 
sequences present in a given template sample. One of the best known amplification methods is 
the polymerase chain reaction (referred to as PCR™) which is described in detail in U.S. Patent 
Nos. 4,683,195, 4,683,202 and 4,800,159, and each incorporated herein by reference in entirety. 

Briefly, in PCR™, two primer sequences are prepared that are complementary to regions 
on opposite complementary strands of the marker sequence. An excess of deoxynucleoside 
triphosphates is added to a reaction mixture along with a DNA polymerase, e.g., Taq 
polymerase. If the marker sequence is present in a sample, the primers will bind to the marker 
and the polymerase will cause the primers to be extended along the marker sequence by adding 
on nucleotides. By raising and lowering the temperature of the reaction mixture, the extended 
primers will dissociate from the marker to form reaction products, excess primers will bind to 
the marker and to the reaction products and the process is repeated. 
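
The primer geometry described above can be illustrated with a minimal in-silico sketch. The function names and all sequences below are hypothetical and chosen only for illustration; the doubling calculation assumes ideal, 100% efficient cycles, which real reactions do not achieve.

```python
# Illustrative in-silico PCR. All names and sequences are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon(template: str, fwd_primer: str, rev_primer: str):
    """Return the top-strand region bounded by the two primers, or None.

    The forward primer matches the top strand directly; the reverse primer
    anneals to the top strand, so its reverse complement is searched for
    downstream of the forward primer site.
    """
    template = template.upper()
    start = template.find(fwd_primer.upper())
    if start == -1:
        return None
    rev_site = reverse_complement(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


def ideal_copies(start_copies: int, cycles: int) -> int:
    """Copy number after `cycles` rounds, assuming perfect doubling."""
    return start_copies * 2 ** cycles
```

Under the ideal-doubling assumption, thirty cycles starting from a single template molecule give 2^30 (roughly 10^9) copies, which is why a modest number of cycles can yield a detectable product.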

A reverse transcriptase PCR™ (RT-PCR™) amplification procedure may be performed 
in order to quantify the amount of mRNA amplified or to prepare cDNA from the desired 
mRNA. Methods of reverse transcribing RNA into cDNA are well known and described in 
Sambrook et al., 1989. Alternative methods for reverse transcription utilize thermostable, 
RNA-dependent DNA polymerases. These methods are described in WO 90/07641, filed 
December 21, 1990, incorporated herein by reference. Polymerase chain reaction 
methodologies are well known in the art. 

Another method for amplification is the ligase chain reaction ("LCR"), disclosed in EPA 
No. 320 308, incorporated herein by reference in its entirety. In LCR, two complementary 
probe pairs are prepared, and in the presence of the target sequence, each pair will bind to 
opposite complementary strands of the target such that they abut. In the presence of a ligase, 
the two probe pairs will link to form a single unit. By temperature cycling, as in PCR™, bound 
ligated units dissociate from the target and then serve as "target sequences" for ligation of 
excess probe pairs. U.S. Patent 4,883,750 describes a method similar to LCR for binding probe 
pairs to a target sequence. 

Qbeta Replicase, described in PCT Application No. PCT/US87/00880, incorporated 
herein by reference, may also be used as still another amplification method in the present 
invention. In this method, a replicative sequence of RNA that has a region complementary to 
that of a target is added to a sample in the presence of an RNA polymerase. The polymerase 
will copy the replicative sequence that can then be detected. 

An isothermal amplification method, in which restriction endonucleases and ligases are 
used to achieve the amplification of target molecules that contain nucleotide 
5'-[alpha-thio]-triphosphates in one strand of a restriction site may also be useful in the 
amplification of nucleic acids in the present invention. 

Strand Displacement Amplification (SDA) is another method of carrying out isothermal 
amplification of nucleic acids which involves multiple rounds of strand displacement and 
synthesis, i.e., nick translation. A similar method, called Repair Chain Reaction (RCR), 
involves annealing several probes throughout a region targeted for amplification, followed by a 
repair reaction in which only two of the four bases are present. The other two bases can be 
added as biotinylated derivatives for easy detection. A similar approach is used in SDA. Target 
specific sequences can also be detected using a cyclic probe reaction (CPR). In CPR, a probe 
having 3' and 5' sequences of non-specific DNA and a middle sequence of specific RNA is 
hybridized to DNA that is present in a sample. Upon hybridization, the reaction is treated with 
RNase H, and the products of the probe are identified as distinctive products that are released 
after digestion. The original template is annealed to another cycling probe and the reaction is 
repeated. 

Still other amplification methods, described in GB Application No. 2 202 328 and in 
PCT Application No. PCT/US89/01025, each of which is incorporated herein by reference in its 
entirety, may be used in accordance with the present invention. In the former application, 
"modified" primers are used in a PCR™-like, template- and enzyme-dependent synthesis. The 
primers may be modified by labeling with a capture moiety (e.g., biotin) and/or a detector 
moiety (e.g., enzyme). In the latter application, an excess of labeled probes is added to a 
sample. In the presence of the target sequence, the probe binds and is cleaved catalytically. 
After cleavage, the target sequence is released intact to be bound by excess probe. Cleavage of 
the labeled probe signals the presence of the target sequence. 

Other nucleic acid amplification procedures include transcription-based amplification 
systems (TAS), including nucleic acid sequence based amplification (NASBA) and 3SR 
(Gingeras et al., PCT Application WO 88/10315, incorporated herein by reference). In NASBA, 
the nucleic acids can be prepared for amplification by standard phenol/chloroform extraction, 
heat denaturation of a clinical sample, treatment with lysis buffer and minispin columns for 
isolation of DNA and RNA or guanidinium chloride extraction of RNA. These amplification 
techniques involve annealing a primer which has target specific sequences. Following 
polymerization, DNA/RNA hybrids are digested with RNase H while double stranded DNA 
molecules are heat denatured again. In either case the single stranded DNA is made fully 
double stranded by addition of a second target specific primer, followed by polymerization. The 
double-stranded DNA molecules are then multiply transcribed by an RNA polymerase such as 
T7 or SP6. In an isothermal cyclic reaction, the RNAs are reverse transcribed into single 
stranded DNA, which is then converted to double stranded DNA, and then transcribed once 
again with an RNA polymerase such as T7 or SP6. The resulting products, whether truncated 
or complete, indicate target specific sequences. 

Davey et al., EPA No. 329 822 (incorporated herein by reference in its entirety) disclose 
a nucleic acid amplification process involving cyclically synthesizing single-stranded RNA 
("ssRNA"), ssDNA, and double-stranded DNA (dsDNA), which may be used in accordance 
with the present invention. The ssRNA is a template for a first primer oligonucleotide, which is 
elongated by reverse transcriptase (RNA-dependent DNA polymerase). The RNA is then 
removed from the resulting DNA:RNA duplex by the action of ribonuclease H (RNase H, an 
RNase specific for RNA in duplex with either DNA or RNA). The resultant ssDNA is a 
template for a second primer, which also includes the sequences of an RNA polymerase 
promoter (exemplified by T7 RNA polymerase) 5' to its homology to the template. This primer 
is then extended by DNA polymerase (exemplified by the large "Klenow" fragment of E. coli 
DNA polymerase I), resulting in a double-stranded DNA ("dsDNA") molecule, having a 
sequence identical to that of the original RNA between the primers and having additionally, at 
one end, a promoter sequence. This promoter sequence can be used by the appropriate RNA 
polymerase to make many RNA copies of the DNA. These copies can then re-enter the cycle 
leading to very swift amplification. With proper choice of enzymes, this amplification can be 
done isothermally without addition of enzymes at each cycle. Because of the cyclical nature of 
this process, the starting sequence can be chosen to be in the form of either DNA or RNA. 

Miller et al., PCT Application WO 89/06700 (incorporated herein by reference in its 
entirety) disclose a nucleic acid sequence amplification scheme based on the hybridization of a 
promoter/primer sequence to a target single-stranded DNA ("ssDNA") followed by 
transcription of many RNA copies of the sequence. This scheme is not cyclic, i.e., new 
templates are not produced from the resultant RNA transcripts. Other amplification methods 
include "RACE" and "one-sided PCR" (Frohman, 1990, incorporated by reference). 

Methods based on ligation of two (or more) oligonucleotides in the presence of nucleic 
acid having the sequence of the resulting "di-oligonucleotide", thereby amplifying the 
di-oligonucleotide, may also be used in the amplification step of the present invention. 

Following any amplification, it may be desirable to separate the amplification product 
from the template and the excess primer for the purpose of determining whether specific 
amplification has occurred. In one embodiment, amplification products are separated by 
agarose, agarose-acrylamide or polyacrylamide gel electrophoresis using standard methods. 
See Sambrook et al., 1989. 

Alternatively, chromatographic techniques may be employed to effect separation. There 
are many kinds of chromatography which may be used in the present invention: adsorption, 
partition, ion-exchange and molecular sieve, and many specialized techniques for using them 
including column, paper, thin-layer and gas chromatography. 

Amplification products must be visualized in order to confirm amplification of the 
marker sequences. One typical visualization method involves staining of a gel with ethidium 
bromide and visualization under UV light. Alternatively, if the amplification products are 
integrally labeled with radio- or fluorometrically-labeled nucleotides, the amplification products 
can then be exposed to x-ray film or visualized under the appropriate stimulating spectra, 
following separation. 

In one embodiment, visualization is achieved indirectly. Following separation of 
amplification products, a labeled, nucleic acid probe is brought into contact with the amplified 
marker sequence. The probe preferably is conjugated to a chromophore but may be 
radiolabeled. In another embodiment, the probe is conjugated to a binding partner, such as an 
antibody or biotin, and the other member of the binding pair carries a detectable moiety. 

In one embodiment, detection is by Southern blotting and hybridization with a labeled 
probe. The techniques involved in Southern blotting are well known to those of skill in the art 
and can be found in many standard books on molecular protocols. See Sambrook et al., 1989. 
Briefly, amplification products are separated by gel electrophoresis. The gel is then contacted 
with a membrane, such as nitrocellulose, permitting transfer of the nucleic acid and 
non-covalent binding. Subsequently, the membrane is incubated with a chromophore-conjugated 
probe that is capable of hybridizing with a target amplification product. Detection is by 
exposure of the membrane to x-ray film or ion-emitting detection devices. 

One example of the foregoing is described in U.S. Patent No. 5,279,721, incorporated 
by reference herein, which discloses an apparatus and method for the automated electrophoresis 
and transfer of nucleic acids. The apparatus permits electrophoresis and blotting without 
external manipulation of the gel and is ideally suited to carrying out methods according to the 
present invention. 

All the essential materials and reagents required for detecting P-TEFb or kinase protein 
markers in a biological sample may be assembled together in a kit. This generally will 
comprise preselected primers for specific markers. Also included may be enzymes suitable for 
amplifying nucleic acids including various polymerases (RT, Taq, etc.), deoxynucleotides and 
buffers to provide the necessary reaction mixture for amplification. 

Such kits generally will comprise, in suitable means, distinct containers for each 
individual reagent and enzyme as well as for each marker primer pair. Preferred pairs of 
primers for amplifying nucleic acids are selected to amplify the sequences specified in SEQ ID 
NO:2 or SEQ ID NO:4 or SEQ ID NO:6 or SEQ ID NO:8 or SEQ ID NO:10 or SEQ ID NO:12 
or SEQ ID NO:14 or SEQ ID NO:16 such that, for example, nucleic acid fragments are 
prepared that include a contiguous stretch of nucleotides identical to, for example, about 15, 20, 
25, 30, 35, etc.; 48, 49, 50, 51, etc.; 75, 76, 77, 78, 79, 80, etc.; 100, 101, 102, 103, etc.; 118, 119, 
120, 121, etc.; 127, 128, 129, 130, 131, etc.; 316, 317, 318, 319, etc.; 322, 323, 324, 325, 326, 
etc.; 361, 362, 363, 364, etc.; 372, 373, 374, 375, etc. of SEQ ID NO:2 or SEQ ID NO:4 or SEQ 
ID NO:6 or SEQ ID NO:8 or SEQ ID NO:10 or SEQ ID NO:12 or SEQ ID NO:14 or SEQ ID 






WO 98/28333 




PCT/US97/23930 



NO:16, so long as the selected contiguous stretches are from spatially distinct regions. Similar 
fragments may be prepared which are identical or complementary to, for example, SEQ ID 
NO:1 such that the fragments do not hybridize to, for example, SEQ ID NO:3. 

In another embodiment, such kits will comprise hybridization probes specific for UspA1 
or UspA2 proteins chosen from a group including nucleic acids corresponding to the sequences 
specified in SEQ ID NO:2 or SEQ ID NO:4 or SEQ ID NO:6 or SEQ ID NO:8 or SEQ ID 
NO:10 or SEQ ID NO:12 or SEQ ID NO:14 or SEQ ID NO:16 or to intermediate lengths of the 
sequences specified. Such kits generally will comprise, in suitable means, distinct containers 
for each individual reagent and enzyme as well as for each marker hybridization probe. 

8.2 Other Assays 

Other methods for genetic screening to accurately detect M. catarrhalis infections that 
alter normal cellular production and processing, in genomic DNA, cDNA or RNA samples may 
be employed, depending on the specific situation. 

For example, one method of screening for genetic variation is based on RNase cleavage 
of base pair mismatches in RNA/DNA and RNA/RNA heteroduplexes. As used herein, the 
term "mismatch" is defined as a region of one or more unpaired or mispaired nucleotides in a 
double-stranded RNA/RNA, RNA/DNA or DNA/DNA molecule. This definition thus includes 
mismatches due to insertion/deletion mutations, as well as single and multiple base point 
mutations. 

U.S. Patent No. 4,946,773 describes an RNase A mismatch cleavage assay that involves 
annealing single-stranded DNA or RNA test samples to an RNA probe, and subsequent 
treatment of the nucleic acid duplexes with RNase A. After the RNase cleavage reaction, the 
RNase is inactivated by proteolytic digestion and organic extraction, and the cleavage products 
are denatured by heating and analyzed by electrophoresis on denaturing polyacrylamide gels. 
For the detection of mismatches, the single-stranded products of the RNase A treatment, 
electrophoretically separated according to size, are compared to similarly treated control 
duplexes. Samples containing smaller fragments (cleavage products) not seen in the control 
duplex are scored as +. 
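
The scoring logic of the mismatch cleavage assay can be sketched as follows. This is an idealized illustration only: it assumes the wild-type and test sequences are written in the same sense and are already aligned, that every mismatch is cleaved, and that cleavage occurs immediately 5' of the mismatched position. None of these simplifications holds exactly for a real RNase A reaction, and the function names are hypothetical.

```python
def mismatch_positions(wild_type: str, sample: str) -> list:
    """0-based positions at which two aligned, equal-length sequences differ."""
    if len(wild_type) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [i for i, (w, s) in enumerate(zip(wild_type, sample)) if w != s]


def fragment_lengths(probe_length: int, cut_positions: list) -> list:
    """Fragment sizes if the probe is cut at each listed position."""
    bounds = [0] + sorted(cut_positions) + [probe_length]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def score(wild_type: str, sample: str) -> str:
    """Return '+' when fragments smaller than the full-length probe appear."""
    frags = fragment_lengths(len(wild_type), mismatch_positions(wild_type, sample))
    return "+" if any(f < len(wild_type) for f in frags) else "-"
```

In this sketch a perfectly matched control yields a single full-length fragment and is scored "-", mirroring the comparison against control duplexes described above.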

Currently available RNase mismatch cleavage assays, including those performed 
according to U.S. Patent No. 4,946,773, require the use of radiolabeled RNA probes. Myers 
and Maniatis in U.S. Patent No. 4,946,773 describe the detection of base pair mismatches using 
RNase A. Other investigators have described the use of an E. coli enzyme, RNase I, in mismatch 
assays. Because it has broader cleavage specificity than RNase A, RNase I would be a desirable 
enzyme to employ in the detection of base pair mismatches if components can be found to 
decrease the extent of non-specific cleavage and increase the frequency of cleavage of 
mismatches. The use of RNase I for mismatch detection is described in literature from Promega 
Biotech. Promega markets a kit containing RNase I that is shown in their literature to cleave 
three out of four known mismatches, provided the enzyme level is sufficiently high. 

The RNase protection assay was first used to detect and map the ends of specific mRNA 
targets in solution. The assay relies on being able to easily generate high specific activity 
radiolabeled RNA probes complementary to the mRNA of interest by in vitro transcription. 
Originally, the templates for in vitro transcription were recombinant plasmids containing 
bacteriophage promoters. The probes are mixed with total cellular RNA samples to permit 
hybridization to their complementary targets, then the mixture is treated with RNase to degrade 
excess unhybridized probe. Also, as originally intended, the RNase used is specific for 
single-stranded RNA, so that hybridized double-stranded probe is protected from degradation. After 
inactivation and removal of the RNase, the protected probe (which is proportional in amount to 
the amount of target mRNA that was present) is recovered and analyzed on a polyacrylamide 
gel. 

The RNase Protection assay was adapted for detection of single base mutations. In this 
type of RNase A mismatch cleavage assay, radiolabeled RNA probes transcribed in vitro from 
wild type sequences are hybridized to complementary target regions derived from test samples. 
The test target generally comprises DNA (either genomic DNA or DNA amplified by cloning in 
plasmids or by PCR™), although RNA targets (endogenous mRNA) have occasionally been 
used. If single nucleotide (or greater) sequence differences occur between the hybridized probe 
and target, the resulting disruption in Watson-Crick hydrogen bonding at that position 
("mismatch") can be recognized and cleaved in some cases by single-strand specific 
ribonuclease. To date, RNase A has been used almost exclusively for cleavage of single-base 
mismatches, although RNase I has recently been shown as useful also for mismatch cleavage. 
There are recent descriptions of using the MutS protein and other DNA-repair enzymes for 
detection of single-base mismatches. 

9.0 Examples 

The following examples are included to demonstrate preferred embodiments of the 
invention. It should be appreciated by those of skill in the art that the techniques disclosed in 
the examples which follow represent techniques discovered by the inventor to function well in 
the practice of the invention, and thus can be considered to constitute preferred modes for its 
practice. However, those of skill in the art should, in light of the present disclosure, appreciate 
that many changes can be made in the specific embodiments which are disclosed and still obtain 
a like or similar result without departing from the spirit and scope of the invention. 

EXAMPLE I: Sequence Analysis and Characterization of uspA1 

Bacterial strains and culture conditions. M. catarrhalis strains 035E, 046E, TTA24, 
012E, FR2682, and B21 have been previously described (Helminen et al., 1993a; Helminen et 
al., 1994; Unhanand et al., 1992). M. catarrhalis strains FR3227 and FR2336 were obtained 
from Richard Wallace, University of Texas Health Center, Tyler, TX. M. catarrhalis strain B6 
was obtained from Elliot Juni, University of Michigan, Ann Arbor, MI. M. catarrhalis strain 
TTA1 was obtained from Steven Berk, East Tennessee State University, Johnson City, TN. 
M. catarrhalis strain 25240 was obtained from the American Type Culture Collection, 
Rockville, MD. M. catarrhalis strains were routinely cultured in Brain Heart Infusion (BHI) 
broth (Difco Laboratories, Detroit, MI) at 37°C or on BHI agar plates in an atmosphere of 
95% air-5% CO2. Escherichia coli strains LE392 and XL1-Blue MRF' (Stratagene, La Jolla, 
CA) were grown on Luria-Bertani medium (Maniatis et al., 1982) supplemented with maltose 
(0.2% w/v) and 10 mM MgSO4 at 37°C, with antimicrobial supplementation as necessary. 

Monoclonal antibodies (MAbs). MAb 17C7 is a murine IgG antibody reactive with the 
UspA proteinaceous material of all M. catarrhalis strains tested to date (Helminen et al., 1994). 
Additional MAbs specific for UspA material (i.e., 16A7, 17B1, and 5C12) were produced for 
this study by fusing spleen cells from mice immunized with outer membrane vesicles from 
M. catarrhalis 035E with the SP2/0-Ag14 plasmacytoma cell line, as described (Helminen et 
al., 1993a). These MAbs were used in the form of hybridoma culture supernatant fluid in 
western blot and dot blot analyses. 

Cloning vectors. Plasmid and bacteriophage cloning vectors utilized in this work and 
the recombinant derivatives of these vectors are listed in Table VI. 

TABLE VI 

Bacteriophages And Plasmids 

Bacteriophage or plasmid    Description                                    Source 

Bacteriophage 
LambdaGEM-11                Cloning vector                                 Promega Corp. (Madison, WI) 
MEH200                      LambdaGEM-11 containing an 11 kb insert of     (Helminen et al., 1994) 
                            M. catarrhalis strain 035E DNA encoding the 
                            UspA proteinaceous material 
ZAP Express                 Cloning vector                                 Stratagene 
USP-100                     ZAP Express with a 2.7 kb fragment of DNA      This study 
                            (containing the uspA1 gene) amplified from 
                            the chromosome of M. catarrhalis strain 035E 

Plasmids 
pBluescript II SK+ (pBS)    Cloning vector, AmpR                           Stratagene 
pJL501.6                    pBS containing the 1.6 kb BglII-EcoRI          This study 
                            fragment from MEH200 
pJL500.5                    pBS containing the 600-bp BglII fragment       This study 
                            from MEH200 


MEH200, the original recombinant bacteriophage clone that produced plaques reactive with the 
UspA-specific MAb 17C7, has been described previously (Helminen et al., 1994). 

Genetic techniques. Standard recombinant DNA techniques including plasmid isolation, 
restriction enzyme digestions, DNA modifications, ligation reactions and transformation of 
E. coli are familiar to those of skill in the art and were performed as previously described 
(Maniatis et al., 1982; Sambrook et al., 1989). 

Polymerase Chain Reaction (PCR™). PCR™ was performed using the GeneAmp kit 
(Perkin-Elmer, Branchburg, NJ). All reactions were carried out according to the manufacturer's 
instructions. To amplify products from total genomic DNA, 1 µg of M. catarrhalis 
chromosomal DNA and 100 ng of each primer were used in each 100 µl reaction. 

Nucleotide sequence analysis. Nucleotide sequence analysis of DNA fragments in 
recombinant plasmids, in bacteriophage, or derived by PCR™ was performed using an Applied 
Biosystems Model 373A automated DNA sequencer (Applied Biosystems, Foster City, CA). 
DNA sequence information was analyzed using the Intelligenetics suite package and programs 
from the University of Wisconsin Genetics Computer Group software analysis package 
(Devereux et al., 1984). Analysis of protein hydrophilicity using the method of Kyte and 
Doolittle (1982) and analysis of repeated amino acid sequences within the UspA protein were 
performed using the MacVector™ software protein matrix analysis package (Eastman Kodak 
Company, Rochester, NY). 
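
For readers unfamiliar with the Kyte and Doolittle method, the following is a minimal sliding-window sketch. The per-residue values are the published Kyte-Doolittle hydropathy scale; the window length of 7 and the function name are illustrative assumptions and do not reflect the settings actually used in the MacVector™ analysis. Negative window averages indicate hydrophilic stretches.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 7) -> list:
    """Mean hydropathy of each full window along the sequence.

    Element i is the average over residues i .. i + window - 1, so the
    result has len(protein) - window + 1 entries (empty if too short).
    """
    scores = [KD_SCALE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]
```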

Identification of recombinant bacteriophage. Lysates were generated from E. coli cells 
infected with recombinant bacteriophage by using the plate lysis method as described 
(Helminen et al., 1994). MAb-based screening of plaques formed by recombinant ZAP Express 
bacteriophage on E. coli XL1-Blue MRF' cells was performed according to the manufacturer's 
instructions (Stratagene, La Jolla, CA). Briefly, nitrocellulose filters soaked in 10 mM IPTG 
were applied to the surface of agar plates five hours after bacteriophage infection of the 
bacterial lawn. After overnight incubation at 37°C, the nitrocellulose pads were removed, 
washed with PBS containing 0.5% (v/v) Tween 20 and 5% (w/v) skim milk (PBS-T) and 
incubated with hybridoma culture supernatant containing the MAb for 4 hours at room 
temperature. After four washes with PBS-T, PBS-T containing 125I-labeled goat anti-mouse 
IgG was applied to each pad. After overnight incubation at 4°C, the pads were washed four 
times with PBS-T, blotted dry, and exposed to film. 

Characterization of M. catarrhalis protein antigens. Outer membrane vesicles were 
prepared from BHI broth-grown M. catarrhalis cells by the EDTA-buffer method (Murphy and 
Loeb, 1989). Proteins present in these vesicles were resolved by sodium dodecyl sulfate 
(SDS)-polyacrylamide gel electrophoresis (PAGE) using 7.5% (w/v) polyacrylamide separating 
gels. These SDS-PAGE-resolved proteins were electrophoretically transferred to nitrocellulose 
and western blot analysis was performed as described using MAb 17C7 as the primary antibody 
(Kimura et al., 1985). For western blot analysis of proteins encoded by DNA inserts in 
recombinant bacteriophage, one part of a lysate from bacteriophage-infected E. coli cells was 
mixed with one part SDS-digestion buffer (Kimura et al., 1985) and this mixture was incubated 
at 37°C for 15 minutes prior to SDS-PAGE. 

Features of the uspA1 gene and its encoded protein product. The nucleotide sequence of 
the M. catarrhalis 035E uspA1 gene and the deduced amino acid sequence of the UspA1 protein 
are provided in SEQ ID NO:2 and SEQ ID NO:1, respectively. The open reading frame (ORF), 
containing 2,493 nucleotides, encoded a protein product of 831 amino acids, with a calculated 
molecular mass of 88,271 daltons. 
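
The figures above are internally consistent, as a short arithmetic check shows. The sketch below assumes that the 2,493 nucleotides comprise coding codons only (i.e., the count excludes the stop codon), which is an inference from the stated numbers rather than something the text says explicitly; the function name is hypothetical.

```python
def codon_count(orf_nucleotides: int) -> int:
    """Number of codons in an ORF whose length is a multiple of three."""
    if orf_nucleotides % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    return orf_nucleotides // 3


residues = codon_count(2493)            # 2,493 nt / 3 nt per codon
mean_residue_mass = 88271 / residues    # daltons per residue
print(residues, round(mean_residue_mass, 1))
```

The check gives 831 codons, matching the 831 amino acids stated, and a mean residue mass of about 106 daltons, which is in the range expected for a typical polypeptide.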

The predicted protein product of the uspA1 ORF had a pI of 4.7, was highly hydrophilic, 
and was characterized by extensively repeated motifs. The first motif consists of the consensus 
sequence NXAXXYSXIGGGXN (SEQ ID NO:24), which is extensively repeated between 
amino acid residues 80 and 170. The second region, from amino acid residues 320 to 460, 
contains a long sequence which is repeated three times in its entirety, but which also contains 
smaller units which are repeated several times themselves. This "repeat within a repeat" 
arrangement is also true of the third region, which extends from amino acid residues 460 to 600. 
This last motif consists of many repeats of the small motif QADI (SEQ ID NO:25) and two 
large repeats which contain the QADI (SEQ ID NO:25) motif within themselves. 
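
A consensus such as NXAXXYSXIGGGXN can be located in a protein sequence with a simple pattern search. The sketch below assumes that X stands for any single residue; the function names and the demonstration sequence used with them are hypothetical and are not taken from the UspA1 sequence.

```python
import re


def consensus_to_pattern(consensus: str) -> str:
    """Translate a consensus (X = any residue) into a regular expression."""
    return "".join("." if c == "X" else re.escape(c) for c in consensus.upper())


def find_motif(protein: str, consensus: str) -> list:
    """Return (1-based start, matched text) for every occurrence.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = re.compile("(?=(" + consensus_to_pattern(consensus) + "))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein.upper())]
```

The same function applied with the consensus "QADI" would list every copy of the short motif, including those embedded within larger repeats.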

Similarity of UspA1 to other proteins. A BLAST-X search (Altschul et al., 1990; Gish 
and States, 1993) of the available databases for proteins with significant homology to UspA1 
indicated that the prokaryotic proteins that were most similar to this M. catarrhalis antigen were 
a putative adhesin of H. influenzae Rd (GenBank accession number U32792) (Fleischmann et 
al., 1995), the Hia adhesin from nontypable H. influenzae (GenBank accession number 
U38617) (Barenkamp and St. Geme III, 1996), and the YadA invasin of Yersinia enterocolitica 
(Skurnik and Wolf-Watz, 1989) (SwissProt:P31489). When the GAP alignment program 
(Devereux et al., 1984) was used to compare the UspA1 sequence to that of these and closely 
related bacterial adhesins, UspA1 proved to be 25% identical and 47% similar to the 
AIDA-I adhesin from enteropathogenic E. coli (Benz and Schmidt, 1989; Benz and Schmidt, 
1992b), 23% identical and 46% similar to Hia (Barenkamp and St. Geme III, 1996), and 24% 
identical and 43% similar to YadA (Skurnik and Wolf-Watz, 1989). Other proteins retrieved 
from database searches as having homology with UspA1 included myosin heavy chains from a 
number of species. 
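
Percent identity and percent similarity figures of this kind are computed from a pairwise alignment. The following sketch shows one simple way to do so, counting only columns in which neither sequence has a gap. The residue groupings used to define "similar" are an illustrative assumption; the GAP program derives similarity from its scoring matrix, so this sketch will not reproduce its exact values.

```python
# Hypothetical groupings of chemically similar residues (illustration only).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def _similar(a: str, b: str) -> bool:
    """True if residues are identical or fall in the same illustrative group."""
    return a == b or any(a in g and b in g for g in SIMILAR_GROUPS)


def identity_and_similarity(aln_a: str, aln_b: str, gap: str = "-"):
    """Return (% identity, % similarity) over ungapped alignment columns."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    cols = [
        (a, b)
        for a, b in zip(aln_a.upper(), aln_b.upper())
        if a != gap and b != gap
    ]
    if not cols:
        raise ValueError("no ungapped columns to compare")
    identical = sum(a == b for a, b in cols)
    similar = sum(_similar(a, b) for a, b in cols)
    return 100.0 * identical / len(cols), 100.0 * similar / len(cols)
```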

EXAMPLE II: Two Genes Encode the Proteins UspA1 and UspA2 

MAb 17C7 binds to a very high molecular weight proteinaceous material of M. 
catarrhalis, designated UspA, that migrates with an apparent molecular weight (in SDS-PAGE) of 
at least 250 kDa. This same MAb also reacts with another antigen band of approximately 100 
kDa, as described in U.S. Patent No. 5,552,146 and incorporated herein by reference, and it is 
bound by a phage lysate from E. coli infected by a recombinant bacteriophage that contained a 
fragment of M. catarrhalis chromosomal DNA. The M. catarrhalis proteinaceous material in the 
phage lysate that binds this MAb migrates at a rate similar to or indistinguishable from that of the 
native UspA material (Helminen et al., 1994). 

Analysis of uspA1. Nucleotide sequence analysis of the M. catarrhalis strain 035E gene 
expressed by the recombinant bacteriophage, designated uspA1, revealed the presence of an ORF 
encoding a predicted protein product with a molecular mass of 88,271 (SEQ ID NO:1). The use 
of the uspA1 ORF in an in vitro DNA-directed protein expression system revealed that the protein 
encoded by the uspA1 gene migrated in SDS-PAGE with an apparent molecular weight of about 
120 kDa. (Those of skill in the art will be aware that denaturing processes, such as SDS-PAGE, 
can alter the migration rate of proteins such that the apparent molecular weight of the denatured 
protein is somewhat different than the predicted molecular weight of the non-denatured protein.) 
In addition, when the uspA1 ORF was introduced into a bacteriophage vector, the recombinant E. 
coli strain containing this recombinant phage expressed a protein that migrated in SDS-PAGE 
apparently at the same rate as the native UspA protein from M. catarrhalis. 

Southern blot analysis of chromosomal DNA from several M. catarrhalis strains, using a 
0.6 kb BglII-PvuII fragment derived from the cloned uspA1 gene as the probe, revealed that, with 
several strains, there were two distinct restriction fragments that bound this uspA1-derived probe 
(FIG. 1), indicating that M. catarrhalis possessed a second gene that had some similarity to the 
uspA1 gene. 

Native very high molecular weight UspA proteinaceous material from M. catarrhalis 
strain 035E was resolved by SDS-PAGE, electroeluted, and digested with a protease. N-terminal 
amino acid sequence analysis of some of the resultant peptides revealed that the amino acid 
sequences of several peptides did not match that of the deduced amino acid sequence of UspA1. 
Other peptides obtained from this experiment were similar to those present in the deduced amino 
acid sequence but not identical. 

Protease and cyanogen bromide (CNBr) Cleavage of High Molecular Weight UspA 
Proteinaceous Material: Three tenths (0.3) mg of purified very high molecular weight UspA 
proteinaceous material (at the time of the purification this material was thought to be a single 
protein) was precipitated with 90% ethanol and the pellet was resuspended in 100 ml of 88% 
formic acid containing 12M urea. Following resuspension, 100 ml of 88% formic acid 
containing 2M CNBr was added and the mixture was incubated in the dark overnight at room 
temperature. One ml (2.0 mg) of purified UspA material was added directly to a vial containing 
25 mg of either trypsin or chymotrypsin. The reaction mixtures were incubated for about 48 hours 
at 37°C. One ml (2.0 mg) of purified UspA material was added directly to a vial containing 15 
mg of endoproteinase Lys-C. The reaction mixtures were incubated for about 48 hours at 37°C. 


The cleavage reaction mixtures were clarified by centrifugation in an Eppendorf™
centrifuge at 12,000 rpm for 5 minutes. The clarified supernatant was loaded directly onto a
Vydac C4 HPLC column using a mobile phase of 0.1% (v/v) aqueous trifluoroacetic acid
(Solvent A) and acetonitrile:H2O:trifluoroacetic acid, 80:20:0.1 (v/v/v) (Solvent B) at a flow
rate of 1.0 ml/min. The reaction mixtures were washed onto the column with 100% Solvent A
followed by elution of cleavage fragments using a 30-minute linear gradient (0-100%) of
Solvent B. Fractions were collected manually, dried overnight in a Speed-Vac and resuspended
in in-house pure water. The resuspended HPLC-separated fractions were subjected to SDS-PAGE
analysis using 10-18% gradient gels in a Tris-Tricine buffer system. The fractions
which exhibited a single peptide band were submitted for direct N-terminal sequence analysis.
Fractions displaying multiple peptide bands were transferred from SDS-PAGE onto a PVDF
membrane and individual bands were excised and submitted for N-terminal sequence analysis.

The N-terminal amino acid sequences of these fragments then were determined using an
Applied Biosystems Model 477A PTH Analyzer (Applied Biosystems, Foster City, CA,
U.S.A.). A summary of these sequences is given in Table VII. About half of the sequences
were found to match the sequence deduced from the uspA1 gene, while the other half did not.
Attempts at shifting the reading frame of the uspA1 gene sequence failed to account for the
non-matching peptide sequences, indicating that the high molecular weight UspA protein may
comprise either a multimer of more than one distinct protein or distinct multimers of two
different proteins.

TABLE VII

Summary of the N-terminal Sequences of Internal Peptide Fragments

Digest                 Sequence (a)

CNBr                   AAQAALSGLFVPYSVGKFNATAALGGYGSK    SEQ ID NO:26
                       GKITKNAARQENG                     SEQ ID NO:27

LysC Digest #1         VIGDLGRKV                         SEQ ID NO:28
                       ALEXNVEEGL                        SEQ ID NO:29
                       ALESNVEEGLXXLS                    SEQ ID NO:30
                       ALEFNGE                           SEQ ID NO:31

LysC Digest #2         SITDLGXKV                         SEQ ID NO:32
                       SITDLGTIVDGFXXX                   SEQ ID NO:33
                       SITDLGTIVD                        SEQ ID NO:34

Trypsin                VDALXTKVNALDXKVNSDXT              SEQ ID NO:35
                       LLAEQQLNGKTLTPV                   SEQ ID NO:36
                       AKHDAASTEKGKMD                    SEQ ID NO:37
                       ALESNVEEGLLDLSG                   SEQ ID NO:38

Trypsin Digest #1      NQNTLIEKTANK                      SEQ ID NO:39
                       IDKNEYSIK                         SEQ ID NO:40
                       SITDLGTK                          SEQ ID NO:41

Trypsin Digest #2      NQNTLIEK                          SEQ ID NO:42
                       ALHEQQLETLTK                      SEQ ID NO:43
                       NSSD                              SEQ ID NO:44
                       NKADADASFETLTK                    SEQ ID NO:45
                       FAATAIAKDK                        SEQ ID NO:46
                       KASSENTQNIAK                      SEQ ID NO:47
                       RLLDQK                            SEQ ID NO:48

Chymotrypsin           AATADAITKNGX                      SEQ ID NO:49
                       AKAXAANXDR                        SEQ ID NO:50

Digest of research     NQADIAQNQTDIQDLAAYNELQ            SEQ ID NO:51
grade UspA with        NQADIANNINNIYELAQQQDQ             SEQ ID NO:52
Lys-C endopeptidase    YNERQTEAIDALN                     SEQ ID NO:53
                       ILGDTAIVSNSQD                     SEQ ID NO:54

(a) Certain residues of several peptides could not be verified and these ambiguities are shown by
an "X" in SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:35,
SEQ ID NO:49 and SEQ ID NO:50. In SEQ ID NO:29 the ambiguous residue is likely to be a
serine; in SEQ ID NO:33, position 13 is likely to be aspartic acid, position 14 is likely to be
glycine and position 15 is likely to be arginine; in SEQ ID NO:35 both positions 13 and 19 are
likely to be serines; in SEQ ID NO:49 the ambiguous residue is likely to be an asparagine; and
in SEQ ID NO:50 position 4 is likely to be serine and position 8 is likely to be threonine.
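
As an illustration only, the ambiguous "X" positions can be treated as wildcards when one
peptide is compared with another. The following Python sketch is not part of the original
analysis; the function name find_with_wildcards is hypothetical, and the two sequences are
SEQ ID NO:29 and SEQ ID NO:38 as listed in Table VII.

```python
def find_with_wildcards(query: str, target: str, wildcard: str = "X") -> int:
    """Return the first index in target where query matches, or -1.

    A wildcard character in the query is allowed to match any residue.
    """
    n = len(query)
    for start in range(len(target) - n + 1):
        window = target[start:start + n]
        if all(q == wildcard or q == t for q, t in zip(query, window)):
            return start
    return -1


seq29 = "ALEXNVEEGL"       # SEQ ID NO:29, one unverified residue
seq38 = "ALESNVEEGLLDLSG"  # SEQ ID NO:38, fully resolved

pos = find_with_wildcards(seq29, seq38)
# Residue of SEQ ID NO:38 that lines up with the "X" of SEQ ID NO:29.
resolved = seq38[pos + seq29.index("X")]
print(pos, resolved)
```

Under these assumptions the aligned residue is a serine, which agrees with the assignment
suggested above for SEQ ID NO:29.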

Additional attempts to resolve the very high molecular weight UspA protein band from M.
catarrhalis strain 035E by SDS-PAGE, followed by electroelution and digestion with proteases
or with cyanogen bromide, again yielded a number of peptides which were sequenced. Several
peptides (peptides 1-6, Table VIII) were obtained, the amino acid sequences of which were
identical or very similar to that deduced from the nucleotide sequence of the uspA1 gene.
However, several additional peptides (peptides 7-10, Table VIII) were not present in the deduced
amino acid sequence. This finding substantiated the suggestion that a second protein was present
in the UspA antigen preparation.

TABLE VIII

Matching or closely matching peptides:

Peptide #      Amino acid sequence (a)

Peptide 1      KALESNVEEGLLDLSGR          (SEQ ID NO:55)
Peptide 2      ALESNVEEGLLELSGRTIDQR      (SEQ ID NO:56)
Peptide 3      NQAHIANNINXIYELAQQQDQK     (SEQ ID NO:57)
Peptide 4      NQADIAQNQTDIQDLAAYNELQ     (SEQ ID NO:58)
Peptide 5      ATHDYNERQTEA               (SEQ ID NO:59)
Peptide 6      KASSENTQNIAK               (SEQ ID NO:60)

Nonmatching peptides:

Peptide #      Amino acid sequence (a)

Peptide 7      MILGDTAIVSNSQDNKTQLKFYK    (SEQ ID NO:61)
Peptide 8      AGDTIIPLDDDXXP             (SEQ ID NO:62)
Peptide 9      LLHEQQLXGK                 (SEQ ID NO:63)
Peptide 10     IFFNXG                     (SEQ ID NO:64)

(a) Certain residues of several peptides could not be verified and these ambiguities are shown by
an "X" in SEQ ID NO:57, SEQ ID NO:62, SEQ ID NO:63 and SEQ ID NO:64.

Further evidence corroborating the assertion that the high molecular weight UspA
proteinaceous material was either a multimer of more than one distinct protein or distinct
multimers of two different proteins was derived from earlier electrospray mass spectroscopic
analysis which predicted that a monomer of the UspA material had a molecular weight of
59,500. This approximately 60 kDa protein reacted immunogenically with the MAbs 17C7, 45-2,
13-1, and 29-31, in contrast to the UspA1 protein which only cross-reacted with MAb 17C7.
The fact that MAb 17C7 reacted with both isolated proteins suggested that this MAb recognized
an epitope common to both proteins.

Preparation of mutant uspA1 construct. The nucleotide sequence of the cloned uspA1 gene
was used to construct an isogenic uspA1 mutant. Oligonucleotide primers (BamHI-ended P1 and
P16 in Table IX) were used to amplify a truncated version of the uspA1 ORF from M. catarrhalis
strain 035E chromosomal DNA; this PCR™ product was cloned into the BamHI site of the
plasmid vector pBluescript II SK+. A 0.6 kb BglII fragment from the middle of this cloned
fragment was excised and was replaced by a BamHI-ended cassette encoding kanamycin
resistance. This new plasmid was grown in E. coli DH5α, purified by column chromatography,
linearized by digestion with EcoRI, precipitated, and then dissolved in water. This linear DNA
molecule was used to electroporate the wild-type M. catarrhalis strain 035E, using a technique
described previously (Helminen et al., 1993b). Approximately 5,000 kanamycin-resistant
transformants were obtained; several picked at random were found to be still reactive with MAb
17C7. One of these kanamycin-resistant clones was randomly chosen for further examination and
Southern blot analysis confirmed that this mutant was isogenic.

Analysis of products expressed by the uspA1 mutant. When whole cell lysates of both the
wild-type M. catarrhalis strain and this mutant were subjected to SDS-PAGE, both the wild-type
strain and the mutant strain still expressed the very high-molecular-weight band originally
designated as UspA. However, a protein of approximately 120 kDa was found to be missing in
the mutant strain (FIG. 2A). The fact that both this mutant and the wild-type parent strain still
expressed a very high molecular weight antigen reactive with MAb 17C7 (FIG. 2B) indicated that
there had to be a second gene in M. catarrhalis strain 035E that encoded a MAb 17C7-reactive
antigen. Furthermore, it should be noted that EDTA-extracted outer membrane vesicles of both
the wild-type strain (FIG. 2C, lanes 5 and 7) and mutant strain (FIG. 2C, lanes 6 and 8) possessed
a protein of approximately 70-80 kDa that was reactive with MAb 17C7. This approximately
70-80 kDa band likely represents one form, perhaps the monomeric form, of the product of a second
gene encoding the MAb 17C7-reactive epitope.

It is important to note that, when chromosomal DNA from both the wild-type parent strain
and the mutant were digested with PvuII and probed in Southern blot analysis with a 0.6 kb
BglII-PvuII fragment derived from the uspA1 gene, the wild-type strain exhibited a 2.6 kb band and a
2.8 kb band which bound this probe (FIG. 3). In contrast, the mutant strain had a 2.6 kb band and
a 3.4 kb band that bound this probe. The presence of the 3.4 kb band was the result of the
insertion of the kan cartridge into the deletion site in the uspA1 gene.
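
The band shift can be checked with simple arithmetic. The Python sketch below is an
illustration only: it assumes that the deleted 0.6 kb fragment lies entirely within the 2.8 kb
PvuII fragment and that the kan cartridge contains no PvuII site. Neither assumption, nor the
cartridge size it implies, is stated in the text, and the function names are hypothetical.

```python
def mutant_band_kb(wild_type_kb: float, deleted_kb: float, inserted_kb: float) -> float:
    """Expected fragment size after a deletion and an insertion within the fragment."""
    return wild_type_kb - deleted_kb + inserted_kb


def implied_insert_kb(wild_type_kb: float, mutant_kb: float, deleted_kb: float) -> float:
    """Insert size implied by the observed wild-type and mutant band sizes."""
    return mutant_kb - wild_type_kb + deleted_kb


# Band sizes reported for the Southern blot; 0.6 kb is the excised fragment.
cassette_kb = implied_insert_kb(wild_type_kb=2.8, mutant_kb=3.4, deleted_kb=0.6)
print(round(cassette_kb, 1))
```

Under these assumptions the observed 2.8 kb to 3.4 kb shift would correspond to a cartridge
of roughly 1.2 kb; this figure is an inference from the band sizes, not a reported value.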
EXAMPLE III: Characterization of UspA2 and uspA2

Construction of fusion proteins. The epitope which binds MAb 17C7 was localized by
using the nucleotide sequence of the uspA1 gene described above to construct fusion proteins.
First, fusion proteins containing five peptides spanning the UspA1 protein were constructed by
using the pGEX4T-2 protein fusion system (Pharmacia LKB). The oligonucleotide primers
used in PCR™ to amplify the desired nucleotide sequences from M. catarrhalis strain 035E
chromosomal DNA are listed in Table IX. Each of these had either a BamHI site or a XhoI site
at the 5' end, thereby allowing directional in-frame cloning of the amplified product into the
BamHI- and XhoI-digested vector. When recombinant E. coli strains expressing each of these
five fusion proteins were used in a colony blot radioimmunoassay, only fusion protein MF-4
readily bound MAb 17C7. Further analysis of the uspA1-derived nucleotide sequence in the
MF-4 fusion construct involved the production of fusion proteins containing 79 amino acid
residues (MF-4-1) and 123 amino acid residues (MF-4-2) derived from the MF-4 fusion protein
(Table IX). These two fusion proteins both bound MAb 17C7 (Table IX). FIG. 4 depicts the
western blot reactivity of MAb 17C7 with the MF-4-1 fusion protein. These two fusion
proteins had in common only a 23-residue region NNINNIYELAQQQDQHSSDIKTL (SEQ ID
NO:65), suggesting that this 23-residue region, designated as the "3Q" peptide, contains the
epitope that binds MAb 17C7.
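
The idea of locating the region that two sequences have in common can be sketched in Python.
The sequences of the MF-4-1 and MF-4-2 fusion proteins are not reproduced in this section,
so, purely as an illustration, the sketch compares the 3Q peptide (SEQ ID NO:65) with one of
the sequenced peptides from Table VII (SEQ ID NO:52). The function name is hypothetical.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous run of residues shared by a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous pair of positions.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


PEPTIDE_3Q = "NNINNIYELAQQQDQHSSDIKTL"  # SEQ ID NO:65
SEQ_52 = "NQADIANNINNIYELAQQQDQ"        # SEQ ID NO:52 (Table VII)

shared = longest_common_substring(PEPTIDE_3Q, SEQ_52)
print(len(PEPTIDE_3Q), shared, len(shared))
```

With these two inputs the shared run is the first 15 residues of the 3Q peptide, which is
consistent with SEQ ID NO:52 having been derived from the region containing that peptide.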
TABLE IX

PCR™ primers used for the production of uspA1 gene fragments for use in the
construction of fusion proteins and mutagenesis and the reactivity of the resulting fusion
protein with MAb 17C7

Fragment Generated    Primer Pair (a)    Reactivity with MAb 17C7

MF-3                  P5-P8              -
MF-4                  P6-P13             +
MF-4.1                P7-P12             +
MF-4.2                P11-P13            +

(a) Primer sequences are as follows:

P5      GGTGCAGGTCAGATCAGTGAC    SEQ ID NO:66
P6      GCCACCAACCAAGCTGAC       SEQ ID NO:67
P7      AGCGGTCGCCTGCTTGATCAG    SEQ ID NO:68
P8      CTGATCAAGCAGGCGACCGCT    SEQ ID NO:69
P11     CAAGATCTGGCCGCTTACAA     SEQ ID NO:70
P12     TTGTAAGCGGCCAGATCTTG     SEQ ID NO:71
P13     TGCATGAGCCGCAAACCC       SEQ ID NO:72
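
The relationship between the forward and reverse primers listed above can be checked with a
short Python sketch. It is offered as an illustration only; the helper name
reverse_complement is hypothetical, and the sequences are SEQ ID NO:68 to SEQ ID NO:71 as
listed in the table.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


primers = {
    "P7": "AGCGGTCGCCTGCTTGATCAG",   # SEQ ID NO:68
    "P8": "CTGATCAAGCAGGCGACCGCT",   # SEQ ID NO:69
    "P11": "CAAGATCTGGCCGCTTACAA",   # SEQ ID NO:70
    "P12": "TTGTAAGCGGCCAGATCTTG",   # SEQ ID NO:71
}

# Each pair anneals to the same site on opposite strands.
p7_p8_paired = reverse_complement(primers["P7"]) == primers["P8"]
p11_p12_paired = reverse_complement(primers["P11"]) == primers["P12"]
# AGATCT is the BglII recognition sequence.
p11_has_bglii_site = "AGATCT" in primers["P11"]
print(p7_p8_paired, p11_p12_paired, p11_has_bglii_site)
```

The check shows that P7/P8 and P11/P12 are reverse-complement pairs, so that adjacent
fragments (for example MF-3 ending at P8 and MF-4.1 starting at P7) abut at the same site.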



Elucidation of the MAb 17C7 Epitope. It is important to note that the nucleotide sequence
encoding this 23-residue polypeptide (i.e., the 3Q peptide) was present in the 0.6 kb BglII-PvuII
fragment used in the Southern blot analysis described in Example II. This finding suggested that
the epitope that bound MAb 17C7 might be encoded by DNA present in both the 2.6 and 2.8 kb
PvuII fragments, with the 2.8 kb PvuII fragment being derived from the cloned uspA1 gene and
the 2.6 kb PvuII fragment representing all or part of another gene encoding this same epitope.

A ligation-based PCR™ system was used to verify this finding. Chromosomal DNA from
the mutant strain was digested to completion with PvuII and was resolved by agarose gel
electrophoresis. Fragments ranging in size from 2-3 kb were excised from the agarose,
blunt-ended, and ligated into the EcoRV site in pBluescript II SK+. This ligation reaction mixture was
precipitated and used in a PCR™ amplification reaction. Each PCR™ reaction contained either
the T3 or T7 primer together with a primer (P9 or P10) derived from the DNA encoding the 3Q
peptide. This approach yielded a 1.7 kb product with the T3 and P10 primers and a 0.9 kb product
from the T7 and P9 primers (FIG. 5). The sum of the sizes of these two products (1.7 kb + 0.9 kb)
is the same as the 2.6 kb size of the desired DNA fragment.

Nucleotide sequence analysis of these two PCR™ products revealed two incomplete ORFs
which, when joined at the region encoding the 3Q peptide, formed a 1,728-bp ORF encoding a
protein with a calculated molecular weight of 62,483 daltons (SEQ ID NO:3). The amino acid
sequence of this protein had 43% identity with that of UspA1. Closer examination revealed that a
region extending from amino acids 278-411 in this second protein, designated UspA2, was nearly
identical to the region in UspA1 between amino acids 505-638 (SEQ ID NO:1). Furthermore,
these two regions both contain the 23-mer (the 3Q peptide) that likely contains the epitope that
binds MAb 17C7. It should also be noted that the four peptides from Table VIII (Peptides 7-10)
that were not found in UspA1 were found to be identical or very similar to peptides in the deduced
amino acid sequence of UspA2. In addition, the first six peptides listed in Table VIII, which
matched or were very similar to peptides in the deduced amino acid sequence of UspA1, also
matched peptides found in the deduced amino acid sequence of UspA2.
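
The figures quoted above can be cross-checked with a few lines of arithmetic. The following
Python sketch is an illustration, not part of the original analysis; it assumes that the
1,728-bp ORF length includes the stop codon, which the text does not state.

```python
ORF_BP = 1728           # length of the uspA2 ORF
CALC_MW_DA = 62483      # calculated molecular weight of the encoded protein

codons = ORF_BP // 3
# ASSUMPTION: the final codon is a stop codon and encodes no residue.
residues = codons - 1
mean_residue_mass = CALC_MW_DA / residues


def region_length(first: int, last: int) -> int:
    """Number of residues in an inclusive amino acid range."""
    return last - first + 1


uspa2_region = region_length(278, 411)  # region in UspA2
uspa1_region = region_length(505, 638)  # corresponding region in UspA1
print(codons, residues, round(mean_residue_mass, 1), uspa2_region, uspa1_region)
```

Under this assumption the ORF encodes 575 residues at a mean of about 109 daltons per
residue, and the two nearly identical regions are both 134 residues long.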

Oligonucleotide primers P1 and P2 (Table IX) were used to amplify a 2.5-2.6 kb fragment
from M. catarrhalis strain 035E chromosomal DNA. Nucleotide sequence analysis of this
PCR™ product was used to confirm the nucleotide sequence of the uspA2 ORF determined from
the ligation-based PCR™ study. These results proved that M. catarrhalis strain 035E contains
two different ORFs (i.e., uspA1 and uspA2) which encode the same peptide (i.e., the 3Q peptide)
which likely binds MAb 17C7. This 3Q peptide appears twice in UspA1 and once in UspA2
(SEQ ID NO:1 and SEQ ID NO:3).

The nucleotide sequences of the two DNA segments encoding these 3Q peptides in uspA1
are nearly identical, with three nucleotides being different. These nucleotide differences did not
cause a change in the amino acid sequence. The nucleotide sequence of the DNA segment
encoding the 3Q peptide in uspA2 is identical to the DNA encoding the first 3Q peptide in uspA1.

As seen in FIG. 2C, lane 7, the three dominant MAb 17C7-reactive bands present in M.
catarrhalis strain 035E outer membrane vesicles have apparent molecular weights of greater than
200 kDa, approximately 120 kDa, and approximately 70-80 kDa. It should be noted that the
existence of several MAb 17C7-reactive bands, with apparent molecular weights of greater than
200 kDa, approximately 120 kDa, and approximately 70-80 kDa, was also apparent in U.S. Patent
5,552,146 (FIG. 1, lane H). Therefore, the existence of more than one M. catarrhalis
antigen reactive with MAb 17C7 was apparent as early as 1991. It is now apparent that the
approximately 120 kDa band likely represents the monomeric form of the UspA1 antigen and the
approximately 70-80 kDa band likely represents the monomeric form of the UspA2 antigen from
M. catarrhalis strain 035E. One or more than one of these species may aggregate to form the
very high molecular weight proteinaceous material (i.e., greater than 200 kDa) of the UspA
antigen.

A new M. catarrhalis strain 035E genomic library was constructed in the bacteriophage
vector ZAP Express (Stratagene, La Jolla, CA). Chromosomal DNA from this strain was partially
digested with Sau3AI and 4-9 kb DNA fragments were ligated into the vector arms according to
the instructions obtained from the manufacturer. This library was amplified in E. coli MRF'. An
aliquot of this library was diluted and plated and the resultant plaques were screened for reactivity
with MAb 17C7. Approximately 24 plaques which bound this MAb were detected; the
responsible recombinant bacteriophage were purified by the single plaque isolation method, and
the DNA insert from one of these bacteriophage was subjected to nucleotide sequence analysis.
Nucleotide sequence analysis of the 2.6 kb DNA fragment present in this recombinant bacteriophage
revealed that, on one end, it contained an incomplete ORF that encoded the 3Q peptide. Until its
truncation by the vector cloning site, the sequence of this incomplete ORF was identical or nearly
identical to that of the uspA2 ORF derived from the ligation-based PCR™ study described
immediately above, providing further evidence that two genes which share a common epitope
encode the UspA antigen.

EXAMPLE IV: Purification of and Immunological Properties of the
Proteins UspA1 and UspA2

Materials and Methods

Bacteria. TTA24 and 035E isolates were as previously described in Example I.
Additional isolates were obtained from the University of Rochester and the American Type
Culture Collection (ATCC). The bacteria were routinely passaged on Mueller-Hinton agar
(Difco, Detroit, MI) incubated at 35°C with 5% carbon dioxide. The bacteria used for the
purification of the protein were grown in sterile broth containing 10 g casamino acids (Difco,
Detroit, MI) and 15 g yeast extract (BBL, Cockeysville, MD) per liter. The isolates were stored
at -70°C in Mueller-Hinton broth containing 40% glycerol.

Purification of UspA2. Bacterial cells (~400 g wet wt. of M. catarrhalis 035E) were
washed twice with 2 liters of pH 6.0, 0.03 M sodium phosphate (NaPO4) containing 1.0%
Triton X-100 (TX-100) (J.T. Baker Inc., Philipsburg, NJ) by stirring at room
temperature for 60 min. Cells containing the UspA2 protein were pelleted by centrifugation at
13,700 x g for 30 min at 4°C. Following centrifugation, the pellet was resuspended in 2 liters
of pH 8.0, 0.03 M Tris(hydroxymethyl)aminomethane-HCl (Tris-HCl) containing 1.0% TX-100
and stirred overnight at 4°C to extract the UspA2 protein. Cells were pelleted by centrifugation
at 13,700 x g for 30 min at 4°C. The supernatant, containing the UspA2 protein, was collected
and further clarified by sequential microfiltration through a 0.8 µm membrane (CN.8, Nalge,
Rochester, NY) then a 0.45 µm membrane (cellulose acetate, low protein binding, Corning,
Corning, NY).

The entire filtered crude extract preparation was loaded onto a 50 x 217 mm (~200 ml)
TMAE column [650(S), 0.025-0.4 mm, EM Separations, Gibbstown, NJ] equilibrated with pH
8.0, 0.03 M Tris-HCl buffer containing 0.1% TX-100 (THT). The column was washed with
400 ml of equilibration buffer followed by 600 ml of 0.25 M NaCl in 0.03 M THT. UspA2 was
subsequently eluted with 800 ml of 1.0 M NaCl in 0.03 M THT. Fractions were screened for
UspA2 by SDS-PAGE and pooled. Pooled fractions (~750 ml), containing UspA2, were
concentrated approximately two-fold by ultrafiltration using an Amicon stirred cell (Amicon
Corp., Beverly, MA) with a YM-100 membrane under nitrogen pressure. The TMAE
concentrate was split into two 175 ml aliquots and each aliquot was buffer exchanged by passage
over a 50 x 280 mm (~550 ml) Sephadex G-25 (Coarse) column (Pharmacia Biotech,
Piscataway, NJ) equilibrated with pH 7.0, 10 mM NaPO4 containing 0.1% TX-100 (10 mM
PT). The buffer exchanged material was subsequently loaded onto a 50 x 217 mm (~425 ml)
ceramic hydroxyapatite column (Type I, 40 µm, Bio-Rad) equilibrated with 10 mM PT. The
column was washed with 450 ml of the equilibration buffer followed by 900 ml of pH 7.0,
0.1 M NaPO4 containing 0.1% TX-100. UspA2 was then eluted with a linear pH 7.0 NaPO4
concentration gradient between 0.1 and 0.2 M NaPO4 containing 0.1% TX-100. An additional
volume of pH 7.0, 0.2 M NaPO4 containing 0.1% TX-100 was applied to the column and
collected to maximize the recovery of UspA2. Fractions were screened for UspA2 by SDS-PAGE
and pooled. The column was then washed with 900 ml of pH 7.0, 0.5 M NaPO4
containing 0.1% TX-100. The fractions from this wash were screened for UspA1 by SDS-PAGE,
pooled, and stored at 4°C. This pool was used for the purification of UspA1.
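
The bed volumes quoted alongside the column dimensions can be compared with the geometric
volume of a cylinder. The Python sketch below is illustrative only and assumes that each pair
of dimensions is the internal diameter and the packed bed height; the function name is
hypothetical.

```python
import math


def bed_volume_ml(diameter_mm: float, length_mm: float) -> float:
    """Volume of a cylindrical bed in ml (1 cm^3 = 1 ml)."""
    radius_cm = diameter_mm / 20.0   # mm diameter -> cm radius
    length_cm = length_mm / 10.0
    return math.pi * radius_cm ** 2 * length_cm


g25_ml = bed_volume_ml(50, 280)             # Sephadex G-25 column
hydroxyapatite_ml = bed_volume_ml(50, 217)  # ceramic hydroxyapatite column
print(round(g25_ml), round(hydroxyapatite_ml))
```

On this assumption the geometry reproduces the approximately 550 ml and 425 ml figures. The
approximately 200 ml stated for the TMAE column is not reproduced by the same 50 x 217 mm
dimensions, so one of those two TMAE figures may differ from what was actually used.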

Purification of UspA1. The UspA1-enriched fractions collected during four separate
purifications of UspA2 were pooled. The combined UspA1 pools were concentrated
approximately threefold by ultrafiltration using an Amicon stirred cell with a YM-100
membrane under nitrogen pressure. The UspA1 concentrate was split into two 175 ml aliquots
and the buffer exchanged by passage over a 50 x 280 mm (~550 ml) Sephadex G-25 column
equilibrated with 10 mM PT. The buffer exchanged material was subsequently loaded onto a
50 x 217 mm (~425 ml) ceramic hydroxyapatite column (Bio-Rad) equilibrated with 10 mM
PT. The column was washed with 450 ml of the equilibration buffer followed by 900 ml of pH
7.0, 0.25 M NaPO4 containing 0.1% TX-100. UspA1 was subsequently eluted with a linear
NaPO4 gradient of pH 7.0, 0.25-0.5 M NaPO4 containing 0.1% TX-100. The fractions
containing UspA1 were identified by SDS-PAGE and pooled.

SDS-PAGE and Western blot Analysis. SDS-PAGE was carried out as described by
Laemmli (1970) using 4 to 20% (w/v) gradient acrylamide gels (Integrated Separation Systems
(ISS), Natick, MA). Proteins were visualized by staining the gels with Coomassie Brilliant
Blue R250. Gels were scanned using a Personal Densitometer SI (Molecular Dynamics Inc.,
Sunnyvale, CA) and molecular weights were estimated with the Fragment Analysis software
(version 1.1) using the prestained molecular weight markers from ISS as standards. Transfer of
proteins to polyvinylidene difluoride (PVDF) membranes was accomplished with a semi-dry
electroblotter and electroblot buffers (ISS). The membranes were probed with protein-specific
antisera or MAbs followed by goat anti-mouse alkaline phosphatase conjugate as the secondary
antibody (BioSource International, Camarillo, CA). Western blots were developed with the
BCIP/NBT Phosphatase Substrate System (Kirkegaard and Perry Laboratories, Gaithersburg,
MD).

Protein Estimation. Protein concentrations were estimated by the BCA assay (Pierce,
Rockford, IL), using bovine serum albumin as the standard.

Enzymatic and Chemical Cleavages of UspA2 and UspA1.

(i) CNBr Cleavage. Approximately 0.3 mg of the purified protein was precipitated with
90% (v/v) ethanol and the pellet resuspended in 100 µl of 88% (v/v) formic acid containing 12
M urea. Following resuspension, 100 µl of 88% (v/v) formic acid containing 2 M CNBr
(Sigma, St. Louis, MO) was added and the mixture incubated overnight at room temperature in
the dark.

(ii) Trypsin and Chymotrypsin Cleavage. Approximately 2 mg of the purified protein
was precipitated with 90% (v/v) ethanol and the pellet resuspended in a total volume of 1 ml of
phosphate-buffered saline (PBS) containing 0.1% TX-100. This preparation was added directly
to a vial containing 25 µg of either trypsin or chymotrypsin (Boehringer Mannheim,
Indianapolis, IN). The reaction mixture was incubated for 48 h at 37°C.

(iii) Endoproteinase Lys-C Cleavage. Approximately 2 mg of the purified protein was
precipitated with 90% (v/v) ethanol and the pellet resuspended in a total volume of 1.0 ml of
PBS containing 0.1% TX-100. This preparation was added directly to a vial containing 15 µg
of endoproteinase Lys-C (Boehringer Mannheim). The reaction mixture was incubated for 48
h at 37°C.
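
For orientation, the enzyme loadings described in (ii) and (iii) can be expressed as
substrate-to-enzyme mass ratios. The Python sketch below is an illustration only; it takes
the quantities as approximately 2 mg of protein with 25 µg of trypsin or chymotrypsin, or with
15 µg of endoproteinase Lys-C, and the function name is hypothetical.

```python
def substrate_to_enzyme_ratio(protein_mg: float, enzyme_ug: float) -> float:
    """Mass of substrate protein per unit mass of enzyme (w/w)."""
    return (protein_mg * 1000.0) / enzyme_ug


trypsin_ratio = substrate_to_enzyme_ratio(protein_mg=2.0, enzyme_ug=25.0)
lys_c_ratio = substrate_to_enzyme_ratio(protein_mg=2.0, enzyme_ug=15.0)
print(f"trypsin/chymotrypsin 1:{trypsin_ratio:.0f}, Lys-C 1:{lys_c_ratio:.0f}")
```

On these figures the digests were run at roughly 1:80 (w/w) enzyme to substrate for trypsin
and chymotrypsin and roughly 1:133 (w/w) for endoproteinase Lys-C.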

(iv) Separation of Peptides. The above cleavage reaction mixtures were centrifuged in
an Eppendorf centrifuge at 12,000 rpm for 5 min and the supernatant loaded directly onto a
Vydac Protein C4 HPLC column (The Separations Group, Hesperia, CA). The solvents used
were 0.1% (v/v) aqueous trifluoroacetic acid (TFA) [Solvent A] and acetonitrile:H2O:TFA,
80:20:0.1 (v/v/v) [Solvent B] at a flow rate of 1.0 ml/min. Following the initial wash with
Solvent A, the peptides were eluted with a linear gradient between 0 and 100% of Solvent B
and detected by absorbance at 220 nm. Suitable fractions were collected, dried in a Speed-Vac
concentrator (Jouan Inc., Winchester, VA) and resuspended in distilled water. The fractions
were separated by SDS-PAGE in 10 to 18% (w/v acrylamide) gradient gels (ISS) in a
Tris-Tricine buffer system (Schagger and von Jagow, 1987). The fractions containing a single
peptide band were submitted directly for N-terminal sequence analysis. Fractions displaying
multiple peptide bands in SDS-PAGE were electrophoretically transferred onto a PVDF
membrane as described above. The membrane was stained with Coomassie Brilliant Blue R-250
and the individual bands excised before submitting them for N-terminal sequence analysis
(Matsudaira, 1987).
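
The composition of the mobile phase at any point in such a linear gradient can be sketched
as follows. This Python example is an illustration only: the 30-minute duration is taken from
the description of the same separation in Example II rather than from this paragraph, the
volumes of the 80:20:0.1 mixture are assumed to be additive, and the function names are
hypothetical.

```python
# Volume fraction of acetonitrile in Solvent B (acetonitrile:H2O:TFA, 80:20:0.1).
ACN_FRACTION_IN_B = 80.0 / (80.0 + 20.0 + 0.1)


def percent_b(t_min: float, duration_min: float = 30.0) -> float:
    """Percentage of Solvent B at time t in a linear 0-100% gradient."""
    fraction = min(max(t_min / duration_min, 0.0), 1.0)  # clamp outside the gradient
    return 100.0 * fraction


def acetonitrile_percent(t_min: float, duration_min: float = 30.0) -> float:
    """Approximate acetonitrile content (% v/v) of the mobile phase at time t."""
    return percent_b(t_min, duration_min) * ACN_FRACTION_IN_B


print(percent_b(15.0), round(acetonitrile_percent(15.0), 1))
```

Because Solvent A contains no acetonitrile, the acetonitrile content rises linearly from 0 to
just under 80% (v/v) over the course of the gradient.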

Determination of subunit size. Determination of molecular weight by Matrix Assisted
Laser Desorption/Ionization-Time of Flight (MALDI-TOF) mass spectrometry (Hillenkamp and
Karas, 1990) was done on a Lasermat 2000 Mass Analyzer (Finnigan MAT, Hemel Hempstead,
UK) with 3,5-dimethoxy-4-hydroxy-cinnamic acid as the matrix. Cold ethanol precipitation
was done on samples containing >0.1% (v/v) TX-100 to remove the detergent. The final
ethanol concentration was 90% (v/v). The precipitated protein was resuspended in water.

Determination of aggregate sizes by gel filtration chromatography. Approximately 1 mg
of the purified protein was precipitated with 90% (v/v) ethanol and the pellet resuspended in a
total volume of 1.0 ml of PBS containing 0.1% TX-100. Two hundred microliters of the
preparation were applied to a Superose-6 HR 10/30 gel filtration column (10 x 300 mm,
Pharmacia) equilibrated in PBS/0.1% TX-100 at a flow rate of 0.5 ml/min. The column was
calibrated using the HMW Calibration Kit (Pharmacia) which contains aldolase with a size of
158,000; catalase with a size of 232,000; ferritin with a size of 440,000; thyroglobulin with a
size of 669,000; and blue dextran with sizes between 2000 and 2,000,000.
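
A common way to use such standards is to fit log10(size) against elution volume and read
unknowns off the fitted line. The Python sketch below illustrates that approach only; the
source does not describe how the calibration was fitted, the elution volumes shown are
HYPOTHETICAL placeholders (none are reported here), and the function names are invented for
the example.

```python
import math

# Sizes of the calibration standards as stated in the text.
STANDARD_SIZES = {"aldolase": 158_000, "catalase": 232_000,
                  "ferritin": 440_000, "thyroglobulin": 669_000}
# HYPOTHETICAL elution volumes in ml, for illustration only.
ELUTION_ML = {"aldolase": 14.21, "catalase": 13.54,
              "ferritin": 12.43, "thyroglobulin": 11.70}


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


names = list(STANDARD_SIZES)
slope, intercept = fit_line([ELUTION_ML[n] for n in names],
                            [math.log10(STANDARD_SIZES[n]) for n in names])


def estimate_size(elution_ml: float) -> float:
    """Estimate molecular size from elution volume using the fitted calibration."""
    return 10 ** (slope * elution_ml + intercept)


print(round(estimate_size(12.43)))
```

Larger species elute earlier, so the fitted slope is negative; sizes outside the range of
the standards would be extrapolations and should be treated with corresponding caution.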

Amino Acid Sequence Analysis. N-terminal sequence analysis was carried out using an
Applied Biosystems Model 477A Protein/Peptide Sequencer equipped with an on-line Model
120A PTH Analyzer (Applied Biosystems, Foster City, CA). The phenylthiohydantoin (PTH)
derivatives were identified by reversed-phase HPLC using a Brownlee PTH C-18 column
(particle size 5 µm, 2.1 mm i.d. x 22 cm l.; Applied Biosystems).

Immunizations. Female BALB/c mice (Taconic Farms, Germantown, NY), age 6-8
weeks, were immunized subcutaneously with two doses of UspA1 or UspA2 four weeks apart.
To prepare the vaccine, purified UspA1 or UspA2 was added to aluminum phosphate, and the
mixture rotated overnight at 4°C. 3-O-deacylated monophosphoryl lipid A (MPL) (Ribi
ImmunoChem Research, Inc.) was added just prior to administration. Each dose of vaccine
contained 5 µg of purified protein, 100 µg of aluminum phosphate and 50 µg of MPL
resuspended in a 200 µl volume. Control mice were injected with 5 µg of CRM197 with the
same adjuvants. Serum samples were collected before the first vaccination and two weeks after
the second immunization. Mice were housed in a specific-pathogen-free facility and provided
water and food ad libitum.

Monoclonal antibodies. The 17C7 MAb was secreted by a hybridoma (ATCC
HB 11093). MAbs 13-1, 29-31, 45-2, and 6-3 were prepared as previously described (Chen et
al., 1995).

Murine model of M. catarrhalis pulmonary clearance. This model was performed as
described previously (Chen et al., 1995).

Enzyme linked immunosorbent assay (ELISA) procedures. Two different ELISA
procedures were used. One was used to examine the reactivity of sera to whole bacterial cells
and the other the reactivity to the purified proteins.

For the whole cell ELISA, the bacteria were grown overnight on Mueller-Hinton agar
and swabbed off the plate into PBS. The turbidity of the cells was adjusted to 0.10 at 600 nm
and 100 µl added to the wells of a 96 well Nunc F Immunoplate (Nunc, Roskilde, Denmark).
The cells were dried overnight at 37°C, sealed with a mylar plate sealer and stored at 4°C until
needed. On the day of the assay, the residual protein binding sites were blocked by adding 5%
non-fat dry milk in PBS with 0.1% Tween 20 (Bovine Lacto Transfer Technique Optimizer
[BLOTTO]) and incubating at 37°C for one hour. The blocking solution was then removed and
100 µl of sera was serially diluted in the wells with blotto. The sera were allowed to incubate
for 1 h at 37°C. The plate wells were soaked with 300 µl PBS containing 0.1% Tween 20 for 30
seconds and washed 3 times for 5 seconds with a Skatron plate washer and then incubated 1 h
at 37°C with goat anti-mouse IgG conjugated to alkaline phosphatase (BioSource) diluted
1:1000 in blotto. After washing, the plates were developed at room temperature with 100 µl per
well of 1 mg/ml p-nitrophenyl phosphate dissolved in diethanolamine buffer. Development was
stopped by adding 50 µl of 3N NaOH to each well. The absorbance of each well was read at
405 nm and titers calculated by linear regression. The titer was reported as the inverse of the
dilution extrapolated to an absorption value of 0.10 units.
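
The titer calculation can be sketched as follows. This Python example is an illustration
under stated assumptions: the text says only that titers were calculated by linear
regression, so regressing absorbance on log10 of the dilution factor is an assumption, the
absorbance readings are HYPOTHETICAL, and the function names are invented for the example.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def endpoint_titer(dilution_factors, absorbances, cutoff=0.10):
    """Dilution factor at which the fitted line reaches the cutoff absorbance."""
    log_dilutions = [math.log10(d) for d in dilution_factors]
    slope, intercept = fit_line(log_dilutions, absorbances)
    if slope >= 0:
        raise ValueError("absorbance must fall as the serum is diluted")
    return 10 ** ((cutoff - intercept) / slope)


# HYPOTHETICAL readings for a serum diluted 1:100, 1:400 and 1:1600.
titer = endpoint_titer([100, 400, 1600], [1.00, 0.70, 0.40])
print(round(titer))
```

Here the fitted line is extrapolated one further four-fold step beyond the last measured
dilution, which is the sense in which the reported titer is an extrapolated value.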

For the ELISA against the purified proteins, the proteins were diluted to a concentration
of 5 µg/ml in a 50 mM sodium carbonate buffer (pH 9.8) containing 0.02% sodium azide
(Sigma Chemical Co.). One hundred microliters were added to each well of a 96 well
E.I.A./R.I.A. medium binding ELISA plate (Costar Corp., Cambridge, MA) and incubated for
16 hours at 4°C. The plates were washed and subsequently treated the same as described for
the whole cell ELISA procedure.

Complement-dependent bactericidal assay. For this assay, 20 µl of the bacterial
suspension containing approximately 1200 cfu bacteria in PBS supplemented with 0.1 mM
CaCl2, MgCl2 and 0.1% gelatin (PCMG) were mixed with 20 µl of serum diluted in PCMG
and incubated for 30 min at 4°C. Complement, prepared as previously described (Chen et al.,
1996), was added to a concentration of 20%, mixed, and incubated 30 min at 35°C. The assay
was stopped by diluting with 200 µl of cold (4°C) PCMG. 50 µl of this suspension was spread
onto Mueller-Hinton plates. Relative killing was calculated as the percent reduction in cfu in
the sample relative to that in a sample in which heat inactivated complement replaced active
complement.
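
The relative killing calculation defined above can be written out explicitly. The Python
sketch below is illustrative; the function name and the colony counts in the usage line are
hypothetical.

```python
def percent_killing(cfu_active: float, cfu_heat_inactivated: float) -> float:
    """Percent reduction in cfu relative to the heat-inactivated complement control."""
    if cfu_heat_inactivated <= 0:
        raise ValueError("control count must be positive")
    return 100.0 * (1.0 - cfu_active / cfu_heat_inactivated)


# HYPOTHETICAL plate counts: 30 colonies with active complement, 120 in the control.
print(percent_killing(30, 120))
```

A value near zero indicates no complement-dependent killing by the serum at that dilution,
while a negative value would indicate growth relative to the control.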

Inhibition of bacterial adherence to HEp-2 cells. The effect of specific antibodies on
bacterial adherence to HEp-2 cells was examined. A total of 5 x 10^4 HEp-2 cells in 300 µl of
RPMI-10 were added to a sterile 8-well Lab-Tek chamber slide (Nunc, Inc., Naperville, Ill.) and
incubated overnight in a 5% CO2 incubator to obtain a monolayer of cells on the slide. The
slide was washed with PBS and incubated with 300 µl of bacterial suspension (A550 = 0.5) or
with a bacterial suspension that had been incubated with antisera (1:100) at 37°C for 1 h. The
slides were then washed with PBS and stained with the Difco quick stain following the
manufacturer's instructions. The slide was viewed and photographed using a light microscope
equipped with a camera (Nikon Microphot-SA, Nikon, Tokyo, Japan).

Protein interaction with fibronectin and vitronectin. The interactions of purified UspA1
and UspA2 with fibronectin were examined by dot blot. Human plasma fibronectin (Sigma
Chemical Co., St. Louis, MO) was applied to a nitrocellulose membrane, and the membrane
was blocked with blotto for 1 h at room temperature. The blot was then washed with PBS and
incubated with purified UspA1 or UspA2 (2 µg/ml in blotto) overnight at 4°C. After three
washes with PBS, the membrane was incubated with the MAb 17C7 diluted in blotto for 2 h at
room temperature and then with goat anti-mouse immunoglobulin conjugated to alkaline
phosphatase (BIO-RAD Lab., Hercules, Calif.) (1:2,000 in PBS with 5% dry milk, 2 h, room
temperature). The membrane was finally developed with a substrate solution containing
nitroblue tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate in 0.1 M Tris-HCl buffer (pH
9.8).

Interaction with vitronectin was examined by a similar procedure. The purified UspA1
and UspA2 were spotted onto the nitrocellulose membrane and the membrane blocked with
blotto. The membrane was then incubated sequentially with human plasma vitronectin (GIBCO
BRL, Grand Island, N.Y., 1 µg/ml in blotto), rabbit anti-human vitronectin serum (GIBCO
BRL), goat anti-rabbit IgG-alkaline phosphatase conjugate and substrate.

Interaction with HEp-2 cells by the purified protein. Each well of a 96-well cell culture
plate (Costar Corp., Cambridge, Mass.) was seeded with 5 x 10^4 HEp-2 cells in 0.2 ml RPMI
containing 10% fetal calf serum and the plate incubated overnight in a 37°C incubator
containing 5% CO2. Purified UspA1 or UspA2 (1 to 1,000 ng) in blotto was added and
incubated at 37°C for 2 h. The plate was washed with PBS and incubated with a 1:1 mixture of
mouse antisera to UspA1 and UspA2 (1:1000 dilution in PBS containing 5% dry milk). The
plate was then washed and incubated with rabbit anti-mouse IgG conjugated to horseradish
peroxidase (1:5,000 in PBS containing 5% dry milk) (Brookwood Biomedical, Birmingham,
AL) at room temperature for 1 h. Finally, the plate was washed and developed with a substrate
solution containing 2,2'-azino-bis-(3-ethyl-benzthiazoline-6-sulfonic acid) at 0.3 mg/ml in pH
4.0 citrate buffer containing 0.03% hydrogen peroxide (KPL, Gaithersburg, MD). Whole
bacteria of strain O35E were included as a positive control. The highest concentration of the
bacteria tested had an optical density of A530 = 1.0. The abscissa for the bacterial data shown in
FIG. 7 plots the values for three-fold dilutions of the bacterial suspension.



Results

Purification of UspA1 and UspA2. The inventors developed a large-scale, high yield
process for extracting and purifying UspA2 from a pellet of M. catarrhalis cells. The method
consisted of three critical steps. First, the UspA2 protein was extracted from the bacteria with
pH 8.0, 0.03 M THT. Second, the cell extract was applied to a TMAE column and the UspA2
protein eluted with NaCl. Finally, the enriched fractions from the TMAE chromatography were
applied to a ceramic hydroxyapatite column and the UspA2 eluted with a linear NaPO4
gradient. A yield of 250 mg of purified UspA2 was typically obtained from approximately 400 g
wet weight of M. catarrhalis O35E strain cells. A single band was seen for the UspA2 in
SDS-PAGE gels by Coomassie blue staining. It corresponded to a molecular size of approximately
240,000 and contained greater than 95% of the protein based on scanning densitometry (FIG. 6A).
A second band reacting with the 17C7 MAb at approximately 125,000 could be detected in the
UspA2 preparation by western blot but not by Coomassie blue staining (FIG. 6C). The cells did
not need to be lysed to achieve this high yield, which suggested that this protein is present in
large amounts on the surface of the bacterium.

A method for the purification of the UspA1 protein was also developed. This protein
co-purified with UspA2 through the initial extraction and TMAE chromatography steps.
Following hydroxyapatite chromatography, however, UspA1 remained bound to the column
and had to be eluted at the higher salt concentration of 500 mM NaPO4. The crude UspA1
preparation obtained in this step was reapplied and eluted from the hydroxyapatite column
using a linear sodium phosphate gradient. A total of 80 mg of purified UspA1 was isolated
from approximately 1.6 kg wet weight of M. catarrhalis O35E strain cells. UspA1 purified using
this method migrated at three different apparent sizes on SDS-PAGE depending on the method of
sample preparation. Unheated samples exhibited a single band at approximately 280,000, whereas
samples heated at 100°C for 3 min exhibited an apparent molecular weight shift to approximately
350,000. Prolonged heating at 100°C resulted in a shift of the 350,000 band to one at 100,000
(FIG. 6B). Following heating of the sample for 7 min at 100°C, the band at 100,000 contained
greater than 95% of the protein based on scanning densitometry of the Coomassie stained gel. In
contrast, UspA2 migrated at 240,000 regardless of the duration of the heating when examined by
SDS-PAGE. The different migration behaviors indicated that the preparations contained two
distinctly different proteins.

Molecular Weight Determinations. MALDI-TOF mass spectrometric analysis for
determination of the molecular weight of UspA2 using a 3,5-dimethoxy-4-hydroxy-cinnamic acid
matrix in the presence of 70% (v/v) aqueous acetonitrile and 0.1% TFA resulted in the
identification of a predominant species with an average molecular mass of 59,518 Da. In addition
to the expected [M+H]+ and [M+2H]2+ molecular ions, the [2M+H]+ and [3M+H]+ ions were
also observed. The latter two ions were consistent with the dimer and the trimer species. Using
similar conditions, the inventors were unable to determine the mass of UspA1.
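As a worked illustration of the ion assignments above, the expected m/z values can be computed from the reported average mass of 59,518 Da. This is a sketch for orientation only; the helper name is hypothetical and the measured peak positions are not given in the text.

```python
PROTON_MASS = 1.00728  # Da, mass of one proton

def mz(monomer_mass, n_monomers, charge):
    """m/z of an ion made of n_monomers subunits carrying `charge` protons."""
    return (n_monomers * monomer_mass + charge * PROTON_MASS) / charge

M = 59518.0  # Da, average mass of the UspA2 species reported in the text

expected_ions = {
    "[M+H]+": mz(M, 1, 1),     # singly protonated monomer
    "[M+2H]2+": mz(M, 1, 2),   # doubly protonated monomer
    "[2M+H]+": mz(M, 2, 1),    # singly protonated dimer
    "[3M+H]+": mz(M, 3, 1),    # singly protonated trimer
}
```

The dimer and trimer ions fall near twice and three times the monomer mass, which is why their presence is read as evidence of non-covalent association of subunits.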

To determine the molecular sizes of the purified proteins in solution, UspA1 and UspA2
were independently run on a Superose-6 HR 10/30 gel filtration column (optimal separation
range: 5,000-5,000,000) calibrated with molecular weight standards. Purified UspA1 exhibited
a native molecular size of 1,150,000 and UspA2 a molecular size of 830,000. These sizes,
however, may be affected by the presence of TX-100.

N-terminal Sequence Analysis of Internal UspA1 and UspA2 Peptides. All attempts to
determine the N-terminal sequences of both UspA2 and UspA1 proved unsuccessful. No
sequence could be determined. This suggested two things. First, the N-termini of both
proteins were blocked, and, second, neither protein preparation contained contaminating
proteins that were not N-terminally blocked.

Thus, to confirm that the primary sequences of purified UspA1 and UspA2 corresponded to those
deduced from their respective gene sequences, internal peptide fragments were generated and
subjected to N-terminal sequence analysis. Tables X and XI show the N-terminal sequences
obtained for fragments generated from the digestion of the UspA2 and UspA1 proteins,
respectively. The sequences matching the primary amino acid sequence deduced from the
respective gene sequences are indicated for each fragment. The UspA2 fragments #3 and #4
exhibited sequence similarity with residues 505-515 and 605-614, respectively, of the amino acid
sequence deduced from the uspA1 gene. In Table XI, UspA1 fragment #3 exhibited sequence
similarity with residues 278-294 of the UspA2 primary sequence. These sequences
corresponded with the domains within UspA1 and UspA2 that share 93% sequence identity.
The remainder of the sequences, however, were unique to the respective proteins.

TABLE X

N-terminal sequences of internal UspA2 peptide cleavage fragments

UspA2 Fragment Sequence a                          Match b                      Cleavage

1) LLAEQQLNG SEQ ID NO:73                          92-100                       Trypsin
2) ALESNVEEGL SEQ ID NO:74                         216-225, 245-254, 274-283    Lys-C
3) ALESNVEEGLLDLS SEQ ID NO:75                     274-288, *505-515            Trypsin
4) AKASAANTDR SEQ ID NO:76                         [illegible], *605-614        Chymotrypsin
5) AATAADAITKNGN SEQ ID NO:77                      439-450                      Chymotrypsin
6) SITDLGTKVDGFDGR SEQ ID NO:78                    458-472                      Lys-C
7) VDALXTKVNALDXKVN SEQ ID NO:79                   473-488                      Trypsin
8) AAQAALSGLFVPYSVGKFNATAALGGYGSK SEQ ID NO:80     506-535                      CNBr

a Underlined residues denote mismatch with the nucleotide derived amino acid sequence.
  Ambiguous residues whose identity could not be verified are denoted by the letter X.
b Asterisk (*) indicates match with UspA1. Without asterisk indicates matches with nucleotide
  derived amino acid sequence of UspA2.






TABLE XI

N-terminal sequences of internal UspA1 peptide cleavage fragments

UspA1 Fragment Sequence a          Match b               Cleavage

1) LENNVEE£XLNLS                   456-468               Lys-C
2) DQKADI                          473-478               Trypsin
3) NNVEEGLLDLSGRLIDQK              504-521, *278-294     Lys-C
4) VAEGFEIF                        690-697               Trypsin
5) AGIATNKQELILQNDRLNRI            701-720               Lys-C

a As per Table X. X denotes an unidentified amino acid residue.
b Asterisk (*) indicates match with UspA2. Without asterisk indicates matches with nucleotide
  derived amino acid sequence of UspA1.

Reactivity of MAbs with UspA1 and UspA2. The western blot analysis of purified
UspA1 and UspA2 revealed that both proteins reacted strongly with the MAb 17C7 described
by Helminen et al. (1994) (FIG. 7). The reactivity of the proteins with other MAbs was also
investigated. The data in Table XII show that, whether assayed by ELISA or western blot, the
MAbs 13-1, 29-31 and 45-2 only reacted with UspA2, the MAbs 7D7, 29C6, 11A6 and 12D5
only reacted with UspA1, while 17C7 and 6-3 reacted with both UspA1 and UspA2. All the
MAbs shown in Table XII bind to whole bacteria when examined by ELISA. These results
indicated that UspA2 was exposed on the surface of the bacterium.



TABLE XII

Summary of reactivity of monoclonal antibodies with purified UspA1,
UspA2 and whole bacteria of strain O35E

                                       Reactivity with
mAb      Isotype                       Whole bacterium a    Purified UspA1 b    Purified UspA2 b

13-1     IgG1κ                         +                    -                   +
29-31    IgG (subclass illegible)      +                    -                   +
45-2     IgG2a                         +                    -                   +
17C7     IgG2a                         +                    +                   +
6-3      IgM                           +                    +                   +
7D7      IgG2b                         +                    +                   -
29C6     IgG1                          +                    +                   -
11A6     IgA                           +                    +                   -
12D5     IgG1                          +                    +                   -

a Determined by whole cell ELISA.
b Determined by ELISA and western blot.

TABLE XIII

Cross-reactivity of antibodies to UspA1 and UspA2 proteins

                   Geometric mean ELISA titer b to
Antiserum to       UspA1            UspA2

UspA1 a            740,642 c        10,748 c
UspA2 a            19,120 d         37,615 d

a The preparation of the sera is described in the text.
b ELISA titers are for total IgG and IgM antibodies for sera pooled from ten mice.
c The difference in titer of the anti-UspA1 with the two purified proteins was statistically
  significant by the Wilcoxon signed rank test (p=0.0002).
d The difference in titer of the anti-UspA2 with the two purified proteins was statistically
  significant by the Wilcoxon signed rank test (p=0.01).

Immunogenicity and antibody cross-reactivity. Antisera to the purified UspA1 and
UspA2 proteins were generated in mice. The titers of antigen specific antibodies (IgG and IgM)
as well as the cross-reactive antibodies in these sera were determined by an ELISA using
each of the purified proteins (Table XIII). Both proteins elicited antibody titers that were
greater against themselves than against the heterologous protein. Thus, the reactivities of both
the MAbs (Table XII) and the polyclonal antibodies indicate that the proteins possessed
both shared and non-shared B-cell epitopes.

Antibody reactivity to whole bacterial cells and bactericidal activity. Antisera to the
UspA1 and UspA2 were assayed by whole cell ELISA against the homologous O35E strain and
several heterologous isolates (Table XIV). The antibodies to UspA1 and to UspA2 reacted
most strongly with the O35E strain. The reactivity of the sera toward the heterologous isolates
indicated they bound antibodies elicited by both UspA1 and UspA2.

TABLE XIV

ELISA and complement mediated bactericidal titers toward whole bacterial cells of
multiple isolates of M. catarrhalis elicited by purified UspA1 and purified UspA2

              Whole cell ELISA titer a          Bactericidal titer b
Isolate       anti-UspA1     anti-UspA2         anti-UspA1     anti-UspA2

O35E          195,261        133,492            400            800
430-345       12,693         18,217             400            400
1230-359      7,873          13,772             400            400
TTA24         14,341         7,770              800            800

a Titer determined for pool of sera from ten mice. The titer of the sera drawn before the first
  immunization was less than 50 for all isolates.
b Bactericidal titers were determined as the inverse of the highest serum dilution killing greater
  than 50% of the bacteria. The titers for the sera from mice immunized contemporaneously
  with CRM197 were less than 100.

The bactericidal activities of the antisera to UspA1 and UspA2 were determined against
O35E and other isolates as well (Table XIV). Both sera had bactericidal titers ranging from
400 to 800 against O35E and the disease isolates. Anti-CRM197 serum, the negative control, as
well as sera drawn before immunization, had titers of <100 against all the strains. These
results were consistent with the previous observation that the epitopes shared by the two
proteins are highly conserved among isolates and the antibodies toward those isolates are
bactericidal.
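The bactericidal titer reported in Table XIV is defined as the inverse of the highest serum dilution killing greater than 50% of the bacteria. A minimal sketch of that rule follows; the function name and the killing values are hypothetical, and returning `None` for a serum with no effective dilution is an assumption made for illustration (the text reports such sera as below the lowest titer tested).

```python
def bactericidal_titer(killing_by_dilution, threshold=50.0):
    """Return the highest reciprocal dilution with killing above `threshold` percent.

    killing_by_dilution maps reciprocal serum dilution -> percent killing.
    Returns None when no tested dilution exceeds the threshold.
    """
    effective = [d for d, k in killing_by_dilution.items() if k > threshold]
    return max(effective) if effective else None

# Hypothetical dilution series for one serum pool.
example_series = {100: 98.0, 200: 91.0, 400: 74.0, 800: 52.0, 1600: 31.0}
example_bactericidal_titer = bactericidal_titer(example_series)
```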

Pulmonary challenge. Immunized mice were given a pulmonary challenge with the
homologous O35E strain or the heterologous TTA24 strain. Relative to the control mice
immunized with CRM197, enhanced clearance of both strains was observed regardless of
whether the mice were immunized with UspA1 or UspA2 (Table XV). No statistical difference
(p>0.05) was seen between the groups of mice immunized with UspA1 and with UspA2.

TABLE XV

Pulmonary clearance of M. catarrhalis by mice immunized with
purified UspA1 and UspA2

Study    Immunogen    Challenge strain    % clearance a    p

1        UspA1        O35E                49.0             0.013
         UspA2                            31.8             0.05
         CRM197                           0
2        UspA1        TTA24               54.6             0.02
         UspA2                            66.6             0.0003
         CRM197                           0

a Challenge method described in text. Numbers are the percentage of bacteria cleared from the
  immunized mice compared to control mice which were immunized with CRM197.
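The percent clearance values in Table XV express bacterial recovery from immunized mice relative to the CRM197 controls. One way such a value could be computed is sketched below. The formula, the function name and the counts are assumptions for illustration; the text does not state how the group counts were summarized.

```python
def percent_clearance(cfu_immunized, cfu_control):
    """Percent of bacteria cleared relative to the control group.

    Assumption: each argument is a summary (for example, a mean) of the cfu
    recovered from the lungs of one group of mice.
    """
    if cfu_control <= 0:
        raise ValueError("control group must have recoverable bacteria")
    return 100.0 * (1.0 - cfu_immunized / cfu_control)

# Hypothetical recoveries: half as many bacteria in the immunized group.
example_clearance = percent_clearance(cfu_immunized=10000, cfu_control=20000)
```

By this definition the control group itself has a clearance of 0, which matches the CRM197 rows of the table.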

Interaction of purified proteins with HEp-2 cells. The purified UspA1 and UspA2 were
tested for their ability to interact with HEp-2 cell monolayers in a 96-well plate using an ELISA.
Protein binding to the HEp-2 cells was detected with a 1:1 mix of the mouse antisera to UspA1
and UspA2. Purified UspA1 bound to HEp-2 cells at concentrations above 10 ng. Weak
binding by the UspA2 was detected at concentrations above 100 ng (FIG. 7). The attachment of
O35E bacteria to HEp-2 cells was used as a positive control. This result, together with the data
showing that the anti-UspA1 antibodies inhibited attachment of the bacteria to HEp-2 cells,
suggested that UspA1 plays an important role in bacterial attachment and that UspA1 was
exposed on the bacterial surface.

Interaction of purified proteins with fibronectin and vitronectin. The purified proteins
were assayed for their ability to interact with fibronectin and vitronectin by dot blot assays.
Human plasma fibronectin immobilized on a nitrocellulose membrane bound purified UspA1
but not UspA2 (FIG. 8), while UspA2 immobilized on the nitrocellulose membrane was capable
of binding vitronectin (FIG. 8). Vitronectin binding by the UspA1 was also detected, but the
reactivity was weaker. Collagen (type IV), porcine mucin (type III), fetuin and heparin were
also tested for interaction with purified UspA1 and purified UspA2, but these did not exhibit
detectable binding.

Discussion

Previous UspA purification attempts yielded preparations containing multiple high
molecular weight protein bands by SDS-PAGE and western blot. Because each of the bands
reacted with the "UspA specific" MAb 17C7, it was thought they represented multiple forms of
the UspA protein (Chen et al., 1996). However, the inventors have discovered that there are
two distinct proteins, UspA1 and UspA2, that share an epitope recognized by the 17C7 MAb.
These two proteins are encoded by different genes. This study shows that UspA1 and UspA2
can be separated from one another. The isolated proteins had different SDS-PAGE mobility
characteristics, different reactivity with a set of monoclonal antibodies, and different internal
peptide sequences. The results, however, were consistent with the proteins sharing a portion of
their peptide sequences, including the MAb 17C7 epitope. The separation of the proteins from
one another has allowed the inventors to further demonstrate how the proteins differ as
well as to examine their biochemical, functional, and immunological characteristics.

In solution, the purified proteins appear to be homopolymers of their respective subunits
held together by strong non-covalent forces. This is indicated by the fact that UspA2 lacks any
cysteines and that treatment of both proteins with reducing agents did not alter their mobilities in
SDS-PAGE. Both gene sequences possess leucine zipper motifs that might mediate coiled-coil
interactions (O'Shea et al., 1991). Even so, it was surprising that the non-covalent bonds of
both proteins were strong enough to resist dissociation not only by the conditions normally used
to prepare samples for SDS-PAGE, but also by high concentrations of chaotropic agents such as
urea (Klingman and Murphy, 1994) and guanidine HCl. Of the two proteins, UspA2 appeared
to be less tightly aggregated; this was indicated by the fact that its subunit size of 59,500 Da
could be determined by mass spectrometry. UspA1, however, was recalcitrant to dissociation
by all the methods tried, and this may be the reason its size could not be determined by mass
spectrometry. In SDS-PAGE, the dominant UspA2 migrated with an apparent size of 240,000
while a far smaller portion migrated at about 125,000 and could only be detected by western
analysis. The mobility of UspA1, however, varied depending on how long the sample was
heated. The smallest form was about 100,000. This was consistent with the size of the gene
product missing from the uspA1 mutant but not with the size of 88,000 Da predicted from the
gene sequence. In solution, both proteins formed larger aggregates than those seen by
SDS-PAGE. Their sizes, as measured by gel filtration chromatography, were 1,150,000 and 830,000
for UspA1 and UspA2, respectively. If the proteins behave this way in vivo, UspA1 and UspA2
likely occur as large molecular complexes on the surface of the bacterium.

The results of the N-terminal amino acid sequence analyses of the UspA2 and UspA1
derived peptides (Tables X and XI) were in agreement with the protein sequences derived from
the respective gene sequences. This confirmed that the purified UspA1 and UspA2 proteins
were the products of the respective uspA1 and uspA2 genes. Further, the experimental and
theoretical amino acid compositions of UspA1 and UspA2 were consistent, given the size of the
proteins and the accuracy of the amino acid determination. There was, however, a discrepancy
between the size determined by mass spectrometry of 59,518 and the size indicated from the
gene sequence for UspA2 of 62,483. This discrepancy suggested that this protein either
undergoes post-translational processing or proteolytic degradation.
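The size of the discrepancy noted above can be made concrete with a short calculation. The two masses are taken from the text; the average residue mass of 110 Da is a rule-of-thumb assumption introduced only to give a rough sense of scale, and the resulting residue count is not a value reported by the inventors.

```python
PREDICTED_MASS = 62483.0   # Da, from the uspA2 gene sequence (value from the text)
MEASURED_MASS = 59518.0    # Da, MALDI-TOF measurement (value from the text)
AVG_RESIDUE_MASS = 110.0   # Da, assumed average mass of one amino acid residue

mass_difference = PREDICTED_MASS - MEASURED_MASS
percent_difference = 100.0 * mass_difference / PREDICTED_MASS
approx_residues = mass_difference / AVG_RESIDUE_MASS
```

Under these assumptions the difference is 2,965 Da, a little under 5% of the predicted mass, or on the order of 27 residues.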

The data also suggest that both proteins are exposed on the bacterial surface. That at
least one of the proteins is exposed is evident from the finding that the MAb 17C7 and
polyclonal sera react with whole cells. The reactivities of the UspA2 specific monoclonal
antibodies 13-1, 29-31 and 45-2 with the bacterial cells in the whole cell ELISA provided
evidence that the UspA2 is a surface protein (Table XII). The reactivities of the UspA1 specific
MAbs 7D7, 29C6, 11A6 and 12D5 with the bacterial cells in the whole cell ELISA provided
evidence that the UspA1 is a surface protein (Table XII). Further evidence for surface exposure
of UspA1 was indicated by the inhibitory effect of the antiserum on bacterial attachment to
HEp-2 cells. The sera to the UspA2 lacked this activity. Thus, both UspA1 and UspA2
appeared to be surface exposed on the bacterium.

Surface exposure of the proteins is probably important for the two proteins' functions.
One function for UspA1 appears to be mediation of adherence to host tissues. The evidence for
this was that UspA1 antibodies inhibited bacterial binding to HEp-2 cells and the purified
protein itself bound to the cells. The relevance of binding to HEp-2 cells is that they are
epithelial cells derived from the larynx, a common site of M. catarrhalis colonization (Schalen
et al., 1992). This confirms the inventors' findings that mutants that do not express UspA1 fail
to bind epithelial cells. The inventors also showed that UspA1 binds fibronectin. Fibronectin
has been reported to be a host receptor for other pathogens (Ljungh and Wadstrom, 1995;
Westerlund and Korhonen, 1993). Examination of the gene sequence, however, failed to reveal
any similarity with the fibronectin binding motifs reported for Gram positive organisms
(Westerlund and Korhonen, 1993). Thus, it is fairly clear that UspA1 plays a role in host
adherence, possibly via cell associated fibronectin.

The function of UspA2 is less certain. Antibodies toward it did not block adherence to
the HEp-2 or Chang cell lines, nor did the purified protein bind to those cells. Yet, UspA2
bound vitronectin strongly. Pathogen binding of vitronectin has been linked to host cell
adherence (Gomez-Duarte et al., 1997; Limper et al., 1993); however, van Dijk and his
co-workers have reported that vitronectin binding by M. catarrhalis may be used by the bacteria
to subvert host defenses (Verduin et al., 1994). The soluble form of vitronectin, known as
complement factor S, regulates formation of the membrane attack complex (Su, 1996). They
suggest that the binding of vitronectin to the M. catarrhalis surface inhibits the formation of the
membrane attack complex, rendering the bacteria resistant to the complement dependent killing
activity of the sera. They have also described two types of human isolates: one that binds
vitronectin and is resistant to the lytic activity of the serum and the other that does not bind
vitronectin and is serum sensitive (Hol et al., 1993). It must be noted, however, that
vitronectin, like all the extracellular matrix proteins, has many forms and serves multiple
functions in the host (Preissner, 1991; Seiffert, 1997). Thus, the interaction of both UspA1 and
UspA2 with the extracellular matrix proteins fibronectin and vitronectin may serve the
bacterium in ways beyond subverting host defenses or acting as receptors for bacterial adhesion.

Even though the two proteins share epitopes and sequences, they have different
biochemical activities and likely serve different biological functions. If an immune response to
the respective protein interferes with its function, it ought to be considered as a vaccine
candidate. The results of the immunological studies in mice indicated that both proteins would
be good vaccine candidates. Mice immunized with either UspA1 or UspA2 developed high
antibody titers toward the homologous and heterologous bacterial isolates. Further, the sera
from these mice had complement dependent bactericidal activity toward all the isolates tested.
In addition, immunized mice exhibited enhanced pulmonary clearance of the homologous
isolate and heterologous isolates. It is important to note that antibodies elicited by the proteins
were partially cross-reactive. This was expected since both react with the 17C7 MAb and share
amino acid sequence.

EXAMPLE V: The Level and Bactericidal Capacity of Child and Adult Human
Antibodies Directed against the Proteins UspA1 and UspA2

To determine if humans have naturally acquired antibodies to the UspA1 and UspA2 of
M. catarrhalis, and the biological activity of these antibodies if present, sera from healthy
humans of various ages were examined using both ELISA and a bactericidal assay. It was found
that healthy people have naturally acquired antibodies to both UspA1 and UspA2 in their sera,
and the level of these antibodies and their bactericidal capacity were age-dependent. These
results also indicate that naturally acquired antibodies to UspA1 and UspA2 are biologically
functional, and thus support their use as vaccine candidates to prevent M. catarrhalis disease.



Material and methods 

Bacteria. The M. catarrhalis strains O35E and TTA24 were as described in Example I.
An ATCC strain (ATCC 25238) and three other clinical isolates from the inventors' collection
were also used.

Human sera. Fifty-eight serum samples were collected from a group of ten children at 2,
4, 6, 7, 15 and 18 months of age who had received routine childhood immunizations.
Individual sera from twenty-six adults and fifteen additional children 18-36 months of age were
also assayed. All sera were obtained from clinically healthy individuals. Information on M.
catarrhalis colonization and infection of these subjects was not collected. The sera were stored
at -70°C.

Purification of UspA1 and UspA2. Purified UspA1 and UspA2 were made from the
O35E strain of M. catarrhalis as described in Example IV herein. Each protein preparation
contained greater than 95% of the specific protein based on densitometric scanning of
Coomassie brilliant blue stained SDS-PAGE gels. Based on western blot analysis using monoclonal
antibodies, each purified protein contained no detectable contamination by the other.

Purification of UspA1 and UspA2 specific antibodies from human plasma. Human
plasmas from two healthy adults were obtained from the American Red Cross (Rochester, N.Y.)
and pooled. The antibodies were precipitated by adding ammonium sulfate to 50% saturation.
The precipitate was collected by centrifugation and dialyzed against PBS. A nitrocellulose
membrane (2 x 3 inches) was incubated with UspA1 or UspA2 at 0.5 mg/ml in PBS containing
0.1% (vol/vol) Triton X-100 for 1 h at room temperature, washed twice with PBS, and residual
binding sites on the membrane blocked with 5% (wt/vol) dry milk in PBS for 2 h at room
temperature. The membrane was then sequentially washed twice with PBS, 100 mM glycine
(pH 2.5) and finally with PBS before incubation with the dialyzed antibody preparation. After
incubating for 4 h at 4°C, the membrane was washed again with PBS, and then with 10 mM Tris
buffer (pH 8.0) containing 1 M sodium chloride to remove non-specific proteins. The bound
antibodies were eluted by incubation in 5 ml of 100 mM glycine (pH 2.5) for 2 min with
shaking. One ml of Tris-HCl (1 M, pH 8.0) was immediately added to the eluate to neutralize
the pH. The eluted antibodies were dialyzed against PBS and stored at -20°C.

Enzyme-linked immunosorbent assay (ELISA). Antibody titers to the O35E and other
M. catarrhalis strains were determined by a whole-cell ELISA as previously described using
biotin-labeled rabbit anti-human IgG or IgA antibodies (Brookwood Biomedical, Birmingham,
Alabama) (Chen et al., 1996). Antibody titers to UspA1 and UspA2 were determined by a
similar method except that the plates were coated with 0.1 µg of purified protein in 100 µl of
PBS per well overnight at room temperature. The IgG subclass antibodies to UspA1 or UspA2
were determined using sheep anti-human IgG subclass antibodies conjugated to alkaline
phosphatase (The Binding Site Ltd., San Diego, Calif.). The antibody end point titer was
defined as the highest serum dilution giving an A415 greater than three times that of the control.
The control wells received all treatments except human sera and usually had absorbance values
ranging from 0.03 to 0.06.
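The end point titer rule defined above can be sketched as follows. The function name and the example readings are hypothetical, and returning `None` when no dilution passes the cutoff is an assumption made for illustration.

```python
def endpoint_titer(readings, control_absorbance, fold=3.0):
    """Return the highest reciprocal dilution whose A415 exceeds fold x control.

    readings maps reciprocal serum dilution -> A415.
    Returns None when no tested dilution exceeds the cutoff.
    """
    cutoff = fold * control_absorbance
    positive = [d for d, a in readings.items() if a > cutoff]
    return max(positive) if positive else None

# Hypothetical dilution series with a control well reading of 0.05 (cutoff 0.15).
example_readings = {100: 1.20, 400: 0.62, 1600: 0.21, 6400: 0.09}
example_endpoint = endpoint_titer(example_readings, control_absorbance=0.05)
```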

The specificity of biotin-labeled rabbit anti-human IgG and IgA antibodies was
determined against purified human IgG, IgM and IgA (Pierce, Rockford, IL) by ELISA. No
cross-reactivity was found. The assay sensitivity determined by testing against purified human
antibodies of appropriate isotype in an ELISA was 15 and 60 ng/ml in the IgG and IgA assays,
respectively. Likewise, the specificity of the human IgG subclass antibody assays was
confirmed in ELISA against purified human myeloma IgG subclass proteins (ICN Biomedicals,
Inc., Irvine, CA), and the assay sensitivity was 15 ng/ml in the IgG1, IgG3 and IgG4 assays,
and 120 ng/ml in the IgG2 assay. Two control sera were included to control for assay-to-assay
variation.

Complement dependent bactericidal assay. The bactericidal activity of the human sera
was determined as described previously (Chen et al., 1996). In some studies, the sera were
absorbed with purified UspA1 or UspA2 prior to the assay. The absorption of specific
antibodies from these sera was accomplished by adding the purified proteins to 20 or 50 µg/ml
final concentration. The final serum dilution was 1:10. The mixtures were incubated for 2 h at
4°C and the precipitate removed by micro-centrifugation. The purified human antibodies
specific for UspA1 and UspA2 were assayed against five M. catarrhalis strains in a similar
manner.

Statistics. Statistical analysis was performed on logarithmically transformed titers using
JMP software (SAS Institute, Cary, N.C.). To allow transformation, a value of one half the
lowest serum dilution was assigned to sera which contained no detectable titers. Comparison of
IgG levels among the age groups was done by analysis of variance, and the relationship of
antibody titer and the bactericidal titer was determined by logistic regression. A p value less
than 0.05 was considered significant.
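The handling of undetectable titers before log transformation can be sketched as below. This is only an illustration of the substitution rule described above, not a reproduction of the JMP analysis: the function names are hypothetical, the use of `None` to mark an undetectable titer is an assumption, and the geometric mean is shown simply as one summary that log-transformed titers allow.

```python
import math

def log_titers(titers, lowest_dilution):
    """log10-transform titers, assigning half the lowest dilution to undetectable sera.

    Undetectable titers are represented by None (an assumption of this sketch).
    """
    floor = lowest_dilution / 2
    return [math.log10(t if t is not None else floor) for t in titers]

def geometric_mean_titer(titers, lowest_dilution):
    """Back-transform the mean of the log10 titers."""
    logs = log_titers(titers, lowest_dilution)
    return 10 ** (sum(logs) / len(logs))

# Hypothetical titers; the first serum was undetectable at the lowest dilution of 1:100.
example_titers = [None, 200, 800]
example_gmt = geometric_mean_titer(example_titers, lowest_dilution=100)
```

With these values the undetectable serum is treated as a titer of 50, so the geometric mean is the cube root of 50 x 200 x 800, which is 200.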

Results

Comparison of serum IgG and IgA titers to UspA1 and UspA2 in children and adults.
The IgG and IgA antibody titers in the sera from ten children collected longitudinally between
2-18 months of age, as well as the random samples from fifteen 18-36 month old children and
twenty-six adults, were determined against the whole bacterial cells of the O35E strain, the
purified UspA1 and the purified UspA2 by ELISA. IgG titers to all three antigens were
detected in almost all the sera (FIG. 9). The IgG titers to UspA1 and UspA2 exhibited strong
age-dependent variation when compared to IgG titers to the O35E bacterium (FIG. 9). The
adult sera had significantly higher IgG titers to the purified proteins than sera from children of
various age groups (p<0.01). Sera from children at 6-7 months of age had the lowest IgG titers
to UspA proteins and the mean titer at this age was significantly lower than that at 2 months of
age (p<0.05).

The levels of IgA antibodies to UspA1, UspA2 and O35E bacterial cells were age
dependent (FIG. 9). A serum IgA titer against UspA1 and UspA2 was detected in all
twenty-six adults and in the children of 18-36 months of age. For children less than 18 months
of age, the proportion exhibiting antigen specific IgA titers increased with age. The mean IgA
titers to UspA1, UspA2 or the O35E bacterium in these sera were low for the first 7 months of
age but gradually increased thereafter (FIG. 9).

Age-dependent subclass distribution of IgG antibodies to UspA1 and UspA2. The IgG
subclass titers to the UspA1 and UspA2 antigens were determined on ten adult sera
and thirty-five children's sera. The subclass distribution was found to be age-dependent. The
most prominent antibodies to the UspA1 and UspA2 antigens were of the IgG1 and IgG3
subclasses, which were detected in almost all sera. The IgG2 and IgG4 titers were either
undetectable or extremely low. Therefore, only data on IgG1 and IgG3 subclasses are reported
(FIG. 10). The IgG3 titers against UspA1 or UspA2 in the adult sera were significantly higher
than the IgG1 titers (p<0.05). The same subclass profile was seen in the sera from the 2 month
old children, although the difference between IgG1 and IgG3 titers did not reach statistical
significance, probably because of the smaller sample size. Sera from children between 4 and 36
months of age all had a similar subclass profile which was different from that of the adults and
2 month old children. The IgG1 titers in these children's sera were either higher than or
equivalent to the IgG3 titers. The mean IgG1 titer to either UspA1 or UspA2 was significantly
higher than the IgG3 titer to the same antigens in these children's sera (p<0.05).

Bactericidal activity. The bactericidal titers of seventeen sera representing different age
groups were determined (Table XVI). All the adult sera and three out of five sera from the two
month old children which had high IgG titers to the UspA proteins had strong bactericidal
activity. Sera from 6 month old children had the least bactericidal activity. All five sera from
this age group had a marginal bactericidal titer of 50, the lowest dilution assayed. The
bactericidal activity of the sera from 18 to 36 month old children was highly variable with titers
ranging from less than 50 to 500. There was a significant linear relationship between the
bactericidal titers and the IgG antibody titers against both UspA1 and UspA2 by logistic
regression analysis (p<0.01) (FIG. 11).

TABLE XVI

The level of IgG antibodies to UspA1 and UspA2 from normal human serum
and the serum bactericidal activity

Subject(a)   Age           ELISA IgG titer(b)        BC titer(c)
                           UspA1        UspA2
1            2 month       17,127       6,268        500
             6 month       4,273        1,363        50
             15 month      798          250          <50
2            2 month       12,078       12,244       500
             6 month       1,357        878          50
             18 month      14,041       14,488       200
3            2 month       30,283       20,362       500
             6 month       1,077        1,947        50
             18 month      2,478        1,475        <50
4            2 month       2,086        869          <50
             6 month       530          802          50
             18 month      9,767        8,591        200
5            2 month       3,233        2,655        <50
             6 month       2,246        360          50
             18 month      26,693       43,703       500
6            1.5-3 year    4,036        2,686        50
7            1.5-3 year    2,037        1,251        50
8            1.5-3 year    341          251          <50
9            1.5-3 year    2,538        1,200        500
10           1.5-3 year    1,078        1,370        500
11           1.5-3 year    1,265        953          50
12           adult         161,750      87,180       450
13           adult         873,680      248,290      >1350
14           adult         154,650      146,900      450
15           adult         10,330       7,860        50
16           adult         35,780       31,230       150
17           adult         19,130       132,200      450

(a) Three consecutive samples from subjects 1 through 5 were collected at the stated ages.
(b) ELISA end point titers to purified UspA1 or UspA2 from the O35E strain were determined
as the highest serum dilution giving an A415 greater than three times the background.
(c) BC titers: bactericidal titer assayed against the O35E strain. Sera were assayed at 1:50,
100, 200, and 500. Bactericidal titer was determined as the highest serum dilution resulting in
killing of 50% or more of the bacteria relative to the control. Control bacteria were incubated
with test serum and heat inactivated complement serum.
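
The bactericidal titer rule in the footnote (highest serum dilution giving 50% or greater killing relative to the control) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the dictionary layout and the colony counts are hypothetical, and a return value of None stands for a titer below the lowest dilution assayed.

```python
def bactericidal_titer(cfu_by_dilution, control_cfu, threshold=0.5):
    """Highest reciprocal serum dilution giving >= threshold killing, else None.

    cfu_by_dilution maps a reciprocal dilution (50 means 1:50) to surviving cfu.
    control_cfu is the count recovered from the heat-inactivated complement control.
    """
    positive = [
        dilution
        for dilution, cfu in cfu_by_dilution.items()
        if 1.0 - cfu / control_cfu >= threshold
    ]
    return max(positive) if positive else None

# Hypothetical colony counts at the four dilutions listed in the footnote.
counts = {50: 90, 100: 260, 200: 480, 500: 850}
print(bactericidal_titer(counts, control_cfu=1000))
```

A serum that fails the 50% criterion at every dilution returns None here, which corresponds to the "<50" entries in the table.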

Bactericidal activity of sera absorbed with purified UspA1 or UspA2. Because normal
human sera contain antibodies to numerous antigens of M. catarrhalis as indicated by western
blot, an absorption method was used to determine the contribution of UspA1 and UspA2
specific antibodies towards the bactericidal activity. Six adult sera were absorbed with purified
UspA1 or UspA2, and the change in ELISA reactivity to UspA proteins determined. A
reduction in ELISA reactivity was seen for all the sera after absorption (Table XVII). Further,
absorption with one protein resulted in a reduction of IgG titers to the other protein. Reduction
of UspA2 reactivity was of the same degree regardless of whether the absorbent was UspA1 or
UspA2. In contrast, there was less reduction in UspA1 reactivity after absorption with UspA2
than with UspA1 (Table XVII). This indicated that antibodies to UspA1 and UspA2 were
partially cross-reactive.

TABLE XVII



ELISA titer of adult sera before and after absorption(a)

Absorbent    IgG titers to UspA1 in sample
             #1         #2         #3         #4        #5        #6
saline       161,750    873,680    154,650    10,330    35,780    19,130
UspA1        2,450      2,210      3,160      1,650     <500      3,010
UspA2        42,620     90,150     33,570     6,420     3,490     4,130

             IgG titers to UspA2(b)
saline       87,180     248,290    146,900    7,860     31,230    13,200
UspA1        2,800      2,120      2,700      2,220     <500      <500
UspA2        <500       1,820      3,010      2,960     <500      <500

(a) Absorption: An aliquot of adult serum was diluted and mixed with purified UspA1 or UspA2
from the O35E strain to a final protein concentration of 50 µg/ml and a final serum dilution of
1:10. The mixtures were incubated at 4°C for 2 h, and precipitates removed by
microcentrifugation.
(b) IgG titers against the UspA1 and UspA2 proteins were end point titers determined with a
starting serum dilution of 1:500.
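
The partial cross-reactivity argument rests on comparing how far each absorbent lowers a titer. A short worked sketch, using the IgG titers to UspA1 reported for sample #1 in Table XVII, makes the comparison explicit. The function name is hypothetical, and censored values such as "<500" are deliberately left out because they give only a lower bound on the reduction.

```python
def percent_reduction(titer_before, titer_after):
    """Percent drop in ELISA end-point titer caused by absorption."""
    return 100.0 * (1.0 - titer_after / titer_before)

# Sample #1 of Table XVII, IgG titer to UspA1.
saline = 161750       # mock absorption
after_uspa1 = 2450    # absorbed with UspA1
after_uspa2 = 42620   # absorbed with UspA2

homologous = percent_reduction(saline, after_uspa1)
heterologous = percent_reduction(saline, after_uspa2)
print(f"absorbed with UspA1: {homologous:.1f}% reduction")
print(f"absorbed with UspA2: {heterologous:.1f}% reduction")
```

For this sample, absorption with UspA1 removes about 98.5% of the UspA1 reactivity while absorption with UspA2 removes about 73.7%, which is the pattern the text describes as partial cross-reactivity.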

The bactericidal titers of the absorbed sera were determined and compared with those
seen before absorption (Table XVIII). Absorption with either UspA1 or UspA2 resulted in
complete loss of bactericidal activity (<50) for all six sera when assayed against the O35E
strain, the strain from which the purified proteins were made (Table XVIII). The bactericidal
activity of the absorbed sera was also reduced by at least three fold when assayed against the
heterologous strain 1230-359. Absorption using UspA1 resulted in greater reduction of the
bactericidal titer against the heterologous strain in 3 out of 6 samples compared to absorption
using UspA2 (Table XVIII). This result was consistent with the difference in the reductions of
ELISA titers to UspA1 after absorption with the two proteins. Absorption using the
combined proteins UspA1 and UspA2 did not result in further reduction of the bactericidal
activity compared to UspA1 alone. All six human sera contained antibodies to a 74 kDa OMP
from M. catarrhalis as determined by western blot analysis, and absorption using the purified
74 kDa protein did not affect the bactericidal activity against either the O35E strain or the
1230-359 strain. This indicated that antibodies to the UspA proteins were the major source of
the bactericidal activity against M. catarrhalis in adult sera.

TABLE XVIII

Bactericidal titer of the adult human sera before and after absorption(a)

Absorbent    Bactericidal titer to O35E strain in sample(b)
             #1       #2       #3       #4       #5       #6
saline       450      >1350    450      50       150      450
UspA1        <50      <50      <50      <50      <50      <50
UspA2        <50      150      <50      <50      <50      <50

             Bactericidal titer to 1230-359 strain(b)
saline       450      4050     >1350    150      150      450
UspA1        50       150      <50      <50      50       150
UspA2        150      1350     450      <50      50       50

(a) Sera were the same as those described in Table XVII.
(b) Bactericidal titer: The bactericidal activity was measured against the O35E or 1230-359
strains with 3-fold diluted sera starting at 1:50. The highest serum dilution resulting in 50% or
greater killing was determined as the bactericidal titer. The purified UspA1 and UspA2 proteins
used for absorption were made from the O35E strain.

Because only small volumes of the children's sera were available, absorption of these sera
was done using a mixture of UspA1 and UspA2 proteins. Absorption resulted in the complete
loss or a significant reduction of bactericidal activity in four out of seven sera (Table XIX). The
four sera, including three from two month old children, all had an initial bactericidal titer of 200
or greater prior to absorption. The other three sera, which did not show a change in bactericidal
titer upon absorption, all had a marginal titer of 50 before absorption. The reduction in ELISA
reactivity to the UspA proteins after absorption confirmed that the antibody concentration had
been reduced. This suggested that antibodies specific for the UspA1 and UspA2 proteins in
children's sera were also a major source of the bactericidal activity towards M. catarrhalis.

TABLE XIX



Bactericidal activity of children's sera before and after absorption with
pooled purified UspA1 and UspA2(a)

Sample   Age         Unabsorbed serum          Absorbed serum
         (months)    A415(b)    BC titer(c)    A415(b)    BC titer(c)
1        2           0.84       200            0.29       <50
2        2           0.93       200            0.19       <50
3        2           0.98       500            0.38       50
4        18          0.88       200            0.43       50
5        15          0.66       50             0.25       50
6        18          0.62       50             0.32       50
7        15          0.68       50             0.35       50

(a) Absorption: Each serum was absorbed with a mixture of UspA1 and UspA2 proteins from
the O35E strain at final protein concentrations of 200, 50 or 20 µg/ml. The same result was seen
for all three absorptions of each sample. Only the data from the assay using 20 µg/ml of protein
are shown.
(b) A415: The absorbance at 415 nm in ELISA using the mixture of UspA1 and UspA2 as
detection antigen. Sera were tested at a 1:300 dilution.
(c) BC titer: Highest serum dilution resulting in 50% or greater killing of the O35E strain in the
assay. Sera were assayed at dilutions 1:50, 200, and 500.

Affinity purified antibodies to UspA1 and UspA2. To confirm their cross-reactivity and
bactericidal activity, antibodies to UspA1 or UspA2 from adult plasma were isolated by an
affinity purification procedure. The purified antibodies reacted specifically with the UspA1 and
the UspA2 proteins but not with non-UspA proteins in the O35E lysates in a western blot assay.
The purified antibodies to one protein also reacted to the other with almost equivalent titer in
ELISA (Table XX). Both antibody preparations exhibited reactivity with five M. catarrhalis
strains in the whole-cell ELISA and bactericidal assay (Table XXI). The bactericidal titers
against all five M. catarrhalis strains ranged between 400 and 800, which was equivalent to
0.25-0.50 µg/ml of the protein in the purified antibody preparations (Table XXI).

TABLE XX



Cross-reactivity of affinity purified human antibodies to UspA1 and UspA2 in ELISA

Antibodies purified to(a)    IgG titers against(b)
                             UspA1       UspA2
UspA1                        50,468      20,088
UspA2                        53,106      52,834

(a) The antibodies were purified from plasma pooled from two healthy adults by immune elution
using purified UspA1 or UspA2 from the O35E strain immobilized on nitrocellulose
membrane.
(b) ELISA end point titers are the highest antibody dilutions giving an A415 greater than three
times the background.
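
The end point rule used in these ELISA footnotes (highest dilution giving an A415 greater than three times the background) can be sketched as below. This is a simplified illustration, not the laboratory's procedure: it assumes the titer is read off directly as one of the tested dilutions, with no interpolation between dilutions, and the function name and absorbance readings are hypothetical.

```python
def endpoint_titer(absorbance_by_dilution, background, factor=3.0):
    """Highest reciprocal dilution whose A415 exceeds factor x background, else None."""
    cutoff = factor * background
    positive = [
        dilution
        for dilution, a415 in absorbance_by_dilution.items()
        if a415 > cutoff
    ]
    return max(positive) if positive else None

# Hypothetical A415 readings for a 3-fold dilution series starting at 1:500.
readings = {500: 1.80, 1500: 0.95, 4500: 0.36, 13500: 0.12}
print(endpoint_titer(readings, background=0.05))
```

Because the reported titers (for example 50,468) do not fall on a simple dilution series, the original values were presumably obtained by some form of interpolation, which this sketch does not attempt to reproduce.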

TABLE XXI

Whole cell ELISA titer and bactericidal titer of affinity purified human
antibodies to UspA1 and UspA2(a)

Assay strain    Whole cell ELISA titer(b)       BC titer(c)
                Ab to UspA1    Ab to UspA2      Ab to UspA1    Ab to UspA2
O35E            12,553         9,939            400            800
ATCC25238       30,843         29,512           400            400
TTA24           51,511         57,045           800            800
216:96          31,140         23,109           400            400
1230-359        8,495          16,458           800            800

(a) The purified antibody preparations were the same as described in Table XX. The specific
reactivities of the purified antibodies to UspA proteins, but not other outer membrane
proteins, were confirmed by western blots.
(b) ELISA end point titers are the highest antibody dilutions giving an A415 greater than three
times the background when assayed against whole bacterial cells.
(c) BC titer: Highest antibody dilution resulting in 50% or greater killing of the bacterial
inoculum in the assay. Antibodies (120 µg/ml) were assayed at dilutions 1:100, 200, 400,
and 800.

Discussion

Previous studies examining human antibodies to M. catarrhalis whole cells or outer
membrane proteins usually focused on a single age group. Further, the biological function of
the antibodies was left largely undetermined (Chapman et al., 1985), and the antigens eliciting
the functional antibodies were not identified. Thus, these previous studies did not provide
information as to the role of naturally acquired antibodies in protection against M. catarrhalis
diseases, nor did they provide clear information as to what antigens are suitable for vaccine
development. The data from this study indicate that the IgG antibodies to UspA1 and UspA2
are present in normal human sera and their levels are age-dependent. These antibodies are an
important source of serum bactericidal activity in both children and adults.

These data indicated that most children had serum IgG antibodies to both UspA1 and
UspA2 at two months of age, although the level varied from individual to individual, and the
IgG subclass profile in these infant sera was similar to that in adult sera. The infant sera had
bactericidal activity. The absorption studies suggested that the bulk of the bactericidal
antibodies in these sera were directed against the UspA1 and the UspA2 proteins. These results
suggest that the IgG antibodies detected in the two month old children are of maternal origin.
This is consistent with the report that umbilical cord serum contains high titers of antibodies to
an extract of M. catarrhalis whole cells (Ejlertsen et al., 1994b).

Due to the lack of clinical information on the study subjects and the small number of
subjects examined in this study, it could not be determined whether maternal antibodies against
UspA, although bactericidal in vitro, were protective in young children. However, at two
months of age the children had significantly higher serum IgG titers against the UspA proteins,
and only a few of these children had a low level of IgA antibodies to M. catarrhalis, as
compared to children at 15-18 months of age. If serum IgA reflects prior mucosal exposure to
the bacterium, then most of the children are not infected by M. catarrhalis in the first few
months of age. One of the reasons may be that the maternal antibodies present in the young
children protect them from infection at this age. This is consistent with the finding that young
children seldom carry this bacterium and do not develop M. catarrhalis disease during the first
months of life (Ejlertsen et al., 1994a).

Children may become susceptible to M. catarrhalis infection as maternal antibodies
wane. In this study, the sera from 6 to 7 month old children had the lowest level of IgG
antibodies to the UspA proteins and barely detectable bactericidal titers against whole cells of
M. catarrhalis. By 15 months of age, nearly all children had serum IgA antibodies to the UspA
proteins, and the level of IgA antibodies had significantly increased along with the level of IgG
antibodies and bactericidal activity when compared with children of 6 to 7 months of age. This
suggested that these children had been exposed to the bacterium and mounted an antibody
response. The fifteen sera from the group of 18-36 month old children all had IgG and IgA
titers to the UspA proteins and the bactericidal titers varied greatly. The UspA specific IgG
antibodies in the older children's sera had different characteristics than the antibodies from the
two month old children. First, the IgG1 antibody titer was significantly higher than the IgG3
titer in the older children's sera, while the opposite was true for the 2 month old children
(FIG. 10). Second, most sera from 2 month old children had bactericidal activity, while
bactericidal activity was barely detectable in the sera from children of 6 months or older. The
low antibody level and the low serum bactericidal activity seen in children between 6 and 36
months of age are consistent with the epidemiological findings that children of this age group
have the highest colonization rate and highest incidence of M. catarrhalis disease (Bluestone,
1986; Ejlertsen et al., 1994b; Leinonen et al., 1981; Roitt et al., 1985; Ruuskanen and
Heikkinen, 1994; Sethi et al., 1995; Teele et al., 1989).

Adults, a population usually resistant to M. catarrhalis infections (Catlin, 1990;
Ejlertsen et al., 1994a), were found to have consistently higher levels of IgG antibodies to the
UspA proteins as well as higher serum bactericidal activity than children. The bactericidal
activity of the adult sera was clearly antibody-mediated since immunoglobulin depleted sera
had no activity (Chen et al., 1996), and the antibodies purified from adult plasma exhibited
complement dependent bactericidal activity. The antibodies purified from human sera using
UspA1 or UspA2 from a single isolate exhibited killing against multiple strains. This result
indicates that humans developed bactericidal antibodies toward the conserved epitopes of UspA
proteins in response to natural infections.

In all adult samples, the IgG antibodies were primarily of the IgG1 and IgG3 subclasses
with IgG3 being higher. This is consistent with previous reports that the IgG3 subclass is a
major constituent of the immune response to M. catarrhalis in adults and children greater than 4
years of age, but not in younger children (Carson et al., 1994; Goldblatt et al., 1990). Of the
four IgG subclasses in humans, IgG3 constitutes only a minor component of the total
immunoglobulin in serum. However, IgG3 antibody has the highest affinity to interact with
C1q, the initial step in the classic complement pathway leading to elimination of the bacterium
by both complement-dependent killing and opsono-phagocytosis (Roitt et al., 1985). Since
IgG3 antibody is efficiently transferred across the placenta, it may also confer protective
immunity to infants. The data from this study indicate that IgG3 antibody to the UspA proteins
is an important component of the immune response to natural infection and has in vitro
biological activity.

As clinical information related to M. catarrhalis infection was not collected for the
study subjects, it is unknown how the antibodies to UspA1 or UspA2 were induced. When
antibodies made against the UspA proteins in guinea pigs were tested for reactivity with other
bacterial species, including Pseudomonas aeruginosa, Neisseria meningitidis, Neisseria
gonorrhoeae, Bordetella pertussis, Escherichia coli, and nontypable Haemophilus influenzae
by western blot, no reactivity was detected. This suggests that the antibodies were elicited as a
specific response to the UspA antigens of M. catarrhalis. This is consistent with the high
colonization rate and the endemic nature of this organism in human populations. Since the
affinity purified antibodies to the two UspA proteins were cross-reactive, it could not be
determined whether the human antibodies were elicited by one or both proteins. It seemed clear
that the shared sequence between these two proteins was the main target of the bactericidal
antibodies.

In summary, this study demonstrated that antibodies to the two UspA proteins are
present in nearly all humans regardless of age. The overall level and subclass distribution of
these antibodies, however, were age-dependent. IgG antibodies against UspA1 and UspA2
were cross-reactive, and are a major source of serum bactericidal activity in adults. The level of
these antibodies and serum bactericidal activity appear to correlate with age-dependent
resistance to M. catarrhalis infection. Since humans make an antibody response to many other
M. catarrhalis antigens in addition to UspA1 and UspA2 after natural infection, it remains to be
determined if immunization with one or both UspA proteins will confer adequate protection in
susceptible populations.

EXAMPLE VI: UspA2 as a Carrier for Oligosaccharides

UspA2 as a pneumococcal saccharide carrier.

This study demonstrates that UspA2 can serve as a carrier for a pneumococcal
saccharide. A seven valent pneumococcal polysaccharide was conjugated to UspA2 by
reductive amination. Swiss Webster mice were immunized on wk 0 and wk 4 and a final bleed
taken on wk 6. Each mouse was immunized subcutaneously (s.c.) in the abdomen with 1 µg
carbohydrate per dose with aluminum phosphate as the adjuvant. A group of mice was
immunized with the PP7F-CRM conjugate as a control. The data for the sera from the 6 wk
bleed are shown in Table XXII, Table XXIII, and Table XXIV. The conjugate elicited
antibodies against the polysaccharide as well as bactericidal antibodies to M. catarrhalis.
These results demonstrate that UspA2 can serve as a carrier for eliciting antibodies to this
pneumococcal saccharide and retain its own immunogenicity.

TABLE XXII

Titers elicited by 7F conjugates to the pneumococcal polysaccharide 7F

Antigen                   IgG ELISA titer to Pn Ps 7F*
PP7F-UspA2 mix            <100
PP7F-UspA2 conjugate      9,514
PP7F-CRM conjugate        61,333

*Pool of sera from five mice.



TABLE XXIII

ELISA titers of sera against whole cells of three M. catarrhalis isolates

Immunogen group(1)        Strain tested
                          O35E       430-345    1230-359
PP7F-UspA2 mix(2)         51,409     4,407      9,124
PP7F-CRM conjugate        56         49         47
PP7F-UspA2 conjugate      31,111     3,529      8,310

(1) Vaccine group consists of 5 Swiss-Webster mice. Each group was immunized at wk 0 and
wk 3 and serum collected at wk 6.
(2) Vaccine composed of 1 µg Pneumo Type 7F and 1 µg UspA2 adjuvanted with aluminum
phosphate.

TABLE XXIV

Complement dependent bactericidal antibodies against three M. catarrhalis isolates

Immunogen group(1)        Strain tested
                          O35E       430-345    1230-359
PP7F-UspA2 mix            400        400        400
PP7F-CRM conjugate        <100       <100       <100
PP7F-UspA2 conjugate      400        400        200

(1) BC50 titer is the highest serum dilution at which >50% of bacteria were killed as compared
to serum from wk 0 mice. The most concentrated serum tested was a 1:100 dilution.

UspA2 as an Haemophilus b Oligosaccharide Carrier.

This study demonstrates that UspA2 can serve as a carrier for an Haemophilus
influenzae type b oligosaccharide (HbO). An HbO sample (average DP=24) was conjugated to
UspA2 by aqueous reductive amination in the presence of 0.1% Triton X-100. The ratio of
HbO to UspA2 was 2:1 by weight. Conjugation was allowed to proceed for 3 days at 35°C and
the conjugate was diafiltered using an Amicon 100K cutoff membrane. The conjugate ratio (mg
carbohydrate/mg UspA2) was 0.43:1. The carbohydrate was determined by orcinol assay and
the protein by Lowry. The number of hydroxy-ethyl lysines was determined by amino acid
analysis and found to be 12.6.

The immunogenicity of the conjugate was examined by immunizing Swiss-Webster
mice. The mice were immunized twice, on wk 0 and wk 4, with 1 µg of carbohydrate. No
adjuvant was used with the conjugate, but one was used with UspA2. The sera were pooled and
titered. The reactivity toward HbPS by the radioantigen binding assay (RABA) was similar to
that seen when HbO is conjugated to CRM197 (Table XXV). The whole cell titer toward the
homologous M. catarrhalis isolate (O35E) was similar to that seen for non-conjugated UspA2
(Table XXVI), as were the bactericidal titers (Table XXVII). Thus, when a carbohydrate
antigen that typically elicits a RABA titer less than 0.10 is conjugated to UspA2, it becomes
immunogenic.



TABLE XXV

Comparison of the immunogenicity of HbO conjugated to UspA2 with that of HbO conjugated
to CRM197, measured against Haemophilus b polysaccharide by Radioantigen Binding Assay
(RABA)

Week     HbO-CRM197    HbO-UspA2
0        <0.10         <0.10
3        2.51          2.87
4        4.46          3.56
6        58.66         18.92




TABLE XXVI

Comparison of immunogenicity of HbO-UspA2 conjugate with non-conjugated UspA2 by
ELISA against whole cells of the O35E isolate of M. catarrhalis

Week     UspA2(a)      HbO-UspA2
0        <50           <50
4        54,284        17,424
6        345,057       561,513

(a) 5 µg UspA2 adjuvanted with 500 µg aluminum phosphate.






TABLE XXVII

Bactericidal titers of sera toward two M. catarrhalis isolates

Isolate    UspA2(a)    HbO-UspA2
O35E       4,500       >4,500
345        n.d.        450

(a) 5 µg UspA2 adjuvanted with 500 µg aluminum phosphate.
n.d. = not determined

EXAMPLE VII: Association of mouse serum sensitivity with expression
of mutant forms of UspA2

When bacteria are killed in the presence of serum that lacks specific antibodies toward
them, this is called "serum sensitivity." In the case of M. catarrhalis, the mutants lacking an
intact UspA2 protein have been found to be serum sensitive. These mutants were constructed
so that one (O35E.1; refer to Example IX for a description of isolates O35E.1, O35E.2 and
O35E.12) did not express UspA1, one (O35E.2) did not express UspA2, and one (O35E.12) did
not express either protein, based on a lack of reactivity with the 17C7 monoclonal antibody.
The O35E.2 and O35E.12 mutants, however, expressed a smaller truncated form of UspA2
(tUspA2) that reacts with antibodies prepared by immunizing mice with purified UspA2. The
tUspA2 could be detected in a western blot of bacterial lysates using either polyclonal
anti-UspA2 sera or the MAb 13-1. The size of the smaller form was consistent with the gene
truncation used for the construction of the two mutants.

This bactericidal capacity was tested by mixing the non-immune mouse sera, a 1:5
dilution of human complement and a suspension of bacteria (approx. 1000 cfu) in the wells of a
microtiter plate. The mouse sera were tested at both a 1:50 and a 1:100 dilution. The number of
surviving bacteria was then determined by spreading a dilution of this bacterial suspension on
agar growth medium. The killing was considered significant when fewer than 50% of the
bacteria were recovered as viable cfu relative to the samples without mouse sera. Killing by the
non-immune sera was seen only for the mutants lacking a "complete" UspA2 (Table XXVIII).

TABLE XXVIII
Bactericidal activity of the pre-immune sera from Balb/c mice

Mutant      Proteins Expressed    Bactericidal Activity of Normal Mouse Sera
O35E        UspA1 & UspA2         -
O35E.1      UspA2                 -
O35E.2      UspA1 & tUspA2        +
O35E.12     tUspA2                +

EXAMPLE VIII: Identification of a Decapeptide Epitope in
UspA1 that Binds MAb 17C7

It was clear from the work with different strains of M. catarrhalis and analyses of their
UspA1 protein sequences that certain epitopic regions must exist which are similar, if not
identical, in all of the strains and provide the basis of the immunogenic response in humans. In
order to identify such immunogenic epitope(s), peptides spanning the UspA1 region known to
contain the binding site for MAb 17C7 were prepared and examined for their ability to bind to
MAb 17C7.

Specifically, overlapping synthetic decapeptides, as shown in Table XXIX and FIG. 12,
that were N-terminally bound to a membrane composed of derivatized cellulose were obtained
from Research Genetics Inc. (Huntsville, AL). After five washes with PBS-Tween containing
5% (w/v) non-fat dry milk, the membrane was incubated with MAb 17C7 (in the
form of hybridoma culture supernatant) overnight at 4°C. Following three washes with
PBS-Tween, the membrane was incubated overnight at 4°C with gentle rocking with 10^6 cpm
of radioiodinated (specific activity 2 x 10^7 cpm/µg protein), affinity-purified goat anti-mouse
immunoglobulin. The membrane was then washed as before and exposed to X-ray film (Fuji
RX safety film, Fuji Industries, Tokyo, Japan).

TABLE XXIX
Decapeptides Used to Identify Binding Site for MAb 17C7

PEPTIDE #    PEPTIDE SEQUENCE    SEQ ID NO
9            SGRLLDQKAD          SEQ ID NO:81
10           QKADIDNNIN          SEQ ID NO:82
11           NNINNIYELA          SEQ ID NO:83
12           NNIYELAQQQ          SEQ ID NO:84
13           YELAQQQDQH          SEQ ID NO:18
14           AQQQDQHSSD          SEQ ID NO:85
15           QDQHSSDIKT          SEQ ID NO:86
16           HSSDIKTLKN          SEQ ID NO:87
17           DIKTLKNNVE          SEQ ID NO:88
18           TLKNNVEEGL          SEQ ID NO:89
19           EEGLLDLSGR          SEQ ID NO:90
20           LSGRLIDQKA          SEQ ID NO:91
21           DQKADIAKNQ          SEQ ID NO:92
22           AKNQADIAQN          SEQ ID NO:93
23           IAQNQTDIQD          SEQ ID NO:94
24           DIQDLAAYNE          SEQ ID NO:95


It is clear from the dot blot results shown in the autoradiograph (FIG. 13) that peptide
13, YELAQQQDQH (SEQ ID NO:18), exhibited optimal binding of MAb 17C7, with peptide 14
(SEQ ID NO:85) exhibiting less than optimal binding. This same peptide (SEQ ID NO:18) is
present in UspA2, which explains why both proteins bind to MAb 17C7.

Interestingly, peptide 12 shows no binding, and binding by peptides 15, 16, 19, 22 and 23 is
probably non-specific. Thus, a comparison of peptides 12, 13, and 14 yields the conclusion that
the 7-mer AQQQDQH (SEQ ID NO:17) is an essential epitope for MAb 17C7 to bind to
UspA1 and UspA2. This conclusion is in agreement with the current understanding that an
immunogenic epitope may comprise as few as five, six or seven amino acid residues.
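
The comparison of peptides 12, 13 and 14 can be reproduced with a short sketch. It finds the longest stretch of residues shared by the two binding peptides (13 and 14) and checks that this stretch is not fully contained in the non-binding peptide 12. The helper function is a hypothetical illustration of the reasoning and is not part of the original analysis.

```python
def longest_common_substring(a, b):
    """Return the longest contiguous run of residues shared by a and b."""
    best = ""
    for i in range(len(a)):
        # Only try windows longer than the current best; stop at the first miss,
        # because any longer window starting at i would contain the missing one.
        for j in range(i + len(best) + 1, len(a) + 1):
            if a[i:j] in b:
                best = a[i:j]
            else:
                break
    return best

peptide_12 = "NNIYELAQQQ"  # SEQ ID NO:84, no binding
peptide_13 = "YELAQQQDQH"  # SEQ ID NO:18, optimal binding
peptide_14 = "AQQQDQHSSD"  # SEQ ID NO:85, weaker binding

shared = longest_common_substring(peptide_13, peptide_14)
print(shared)                # residues common to both binding peptides
print(shared in peptide_12)  # whether the non-binding peptide carries them all
```

The shared stretch is the 7-mer AQQQDQH, and peptide 12 contains only its first four residues, which is consistent with the conclusion drawn in the text.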






WO 98/28333 




PCT/US97/23930 



EXAMPLE IX: Phenotypic Effect of Isogenic uspA1 and uspA2 Mutations on
M. catarrhalis Strain O35E

Materials and Methods 

Bacterial strains, plasmids and growth conditions. The bacterial strains and plasmids
used in this study are listed in Table XXX. M. catarrhalis strains were routinely grown at 37°C
on Brain-Heart Infusion (BHI) agar plates (Difco Laboratories, Detroit, MI) in an atmosphere of
95% air-5% CO2, supplemented, when necessary, with kanamycin (20 µg/ml) (Sigma Chemicals
Co., St. Louis, MO) or chloramphenicol (0.5 µg/ml) (Sigma), or in BHI broth. The BHI broth
used to grow M. catarrhalis cells for attachment assays was sterilized by filtration. Escherichia
coli strains were cultured on Luria-Bertani (LB) agar plates (Maniatis et al., 1982)
supplemented, when necessary, with ampicillin (100 µg/ml), kanamycin (30 µg/ml), or
chloramphenicol (30 µg/ml).

TABLE XXX
Bacterial Strains and Plasmids Used in this Study

Strain or plasmid    Description                                  Source or reference

M. catarrhalis
O35E                 Wild-type isolate from middle ear fluid      Helminen et al., 1994
O35E.1               Isogenic mutant of O35E with a kan           Aebi et al., 1997
                     cartridge in the uspA1 structural gene
O35E.2               Isogenic mutant of O35E with a kan           Aebi et al., 1997
                     cartridge in the uspA2 structural gene
O35E.12              Isogenic mutant of O35E with a kan           This study
                     cartridge in the uspA2 structural gene
                     and a cat cartridge in the uspA1
                     structural gene
P-44                 Wild-type isolate that exhibits rapid        Soto-Hernandez et al., 1989
                     hemagglutination
P-48                 Wild-type isolate that exhibits slow         Soto-Hernandez et al., 1989
                     hemagglutination

Escherichia coli
DH5α                 Host for cloning studies                     Stratagene

Plasmids
pBluescript II       Cloning vector; Amp^r                        Stratagene
pUSPA1               pBluescript II SK+ with a 2.7 kb insert      Aebi et al., 1997
                     containing most of the uspA1 gene of
                     M. catarrhalis strain O35E
pUSPA1CAT            pUSPA1 with a cat cartridge replacing        This study
                     the 0.6 kb BglII fragment of the uspA1
                     gene



Characterization of outer membrane proteins. Whole cell lysates and outer membrane
vesicles of M. catarrhalis strains were prepared as described (Murphy and Loeb, 1989; Patrick
et al., 1987). Proteins present in these preparations were resolved by SDS-PAGE and detected
by staining with Coomassie blue or by western blot analysis as described (Helminen et al.,
1993a).

Monoclonal antibodies (MAbs). MAb 17C7, a murine IgG antibody that reacts with a
conserved epitope of both UspA1 and UspA2 from M. catarrhalis strain O35E, as described in
earlier examples herein, was used for immunologic detection of these proteins. MAb 17C7 was
used in the form of hybridoma culture supernatant fluid in western blot analysis and in the
indirect antibody-accessibility assay. MAb 3F12, an IgG MAb specific for the major outer
membrane protein of Haemophilus ducreyi (Klesney-Tait et al., 1997), was used as a negative
control in the indirect antibody-accessibility assay.

Molecular cloning methods. Chromosomal DNA of M. catarrhalis strain O35E was
used as the template in a polymerase chain reaction (PCR™) system together with
oligonucleotide primers derived from either just after the start of the strain O35E uspA1 open
reading frame (i.e., P1 in FIG. 14) or just after the end of this open reading frame (i.e., P2 in
FIG. 14). These primers were designed to contain a BamHI restriction site at their 5'-end. The
sequences of these primers were:

P1 - 5'-CGGGATCCGTGAAGAAAAATGCCGCAGGT-3' (SEQ ID NO:96);
P2 - 5'-CGGGATCCCGTCGCAAGCCGATTG-3' (SEQ ID NO:97).

DNA fragments were amplified using a PTC 100 Programmable Thermal Controller (MJ
Research, Inc., Cambridge, MA) and the GeneAmp PCR™ kit (Roche Molecular Systems, Inc.,
Branchburg, NJ). PCR™ products were extracted from 0.7% agarose gel slices using the Qiaex
Gel Extraction Kit (Qiagen, Inc., Chatsworth, CA) and digested with BamHI (New England
Biolabs, Inc., Beverly, MA) for subsequent ligation into the BamHI site of pBluescript II SK+
(Stratagene, La Jolla, CA). Ligation reactions were performed with overnight incubation at
16°C using T4 DNA ligase (Gibco BRL, Inc., Gaithersburg, MD). Competent E. coli DH5α
cells were transformed with the ligation reaction mixture according to a standard heat-shock
procedure (Sambrook et al., 1989) and the desired recombinants were selected by culturing in
the presence of an appropriate antimicrobial compound. The 1.3 kb chloramphenicol (cat)
resistance cartridge was prepared by excision (using BamHI) from pUCAECAT (Wyeth-
Lederle, Rochester, NY). The cat cartridge was subsequently ligated into BglII restriction sites
located in the mid-portion of the cloned segment from the uspA1 gene and, after transformation
of competent E. coli DH5α cells, recombinant clones were identified by selection on solidified
media containing chloramphenicol.
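
The primer design described above can be checked with a brief Python sketch. It is an illustration only: the helper names are hypothetical, and the two template strings are short excerpts copied from SEQ ID NO:2 (nucleotides 331-360 and 2961-2990 as numbered in the sequence listing). The sketch confirms that each primer carries a BamHI recognition site (GGATCC) near its 5'-end and that the remaining 3' portion matches the template, in the forward orientation for P1 and as the reverse complement for P2.

```python
BAMHI_SITE = "GGATCC"

P1 = "CGGGATCCGTGAAGAAAAATGCCGCAGGT"  # SEQ ID NO:96
P2 = "CGGGATCCCGTCGCAAGCCGATTG"       # SEQ ID NO:97

# Excerpts of SEQ ID NO:2 (assumed positions 331-360 and 2961-2990)
TEMPLATE_5P = "TTTATAAAGTGAAGAAAAATGCCGCAGGTC"
TEMPLATE_3P = "TGCTAGACAATCGGCTTGCGACGATAAAAT"


def reverse_complement(seq: str) -> str:
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def annealing_part(primer: str) -> str:
    """Return the primer portion that follows the BamHI site."""
    start = primer.index(BAMHI_SITE)
    return primer[start + len(BAMHI_SITE):]


for name, primer in (("P1", P1), ("P2", P2)):
    print(name, "BamHI site at offset", primer.index(BAMHI_SITE))

print(annealing_part(P1) in TEMPLATE_5P)                      # forward primer
print(reverse_complement(annealing_part(P2)) in TEMPLATE_3P)  # reverse primer
```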

Transformation of M. catarrhalis. The electroporation method used for transformation
of M. catarrhalis strain O35E has been described in detail (Helminen et al., 1993b). Briefly, a
30-ml portion of a logarithmic-phase broth culture (10^9 colony forming units [cfu]/ml) was
harvested by centrifugation, washed three times with 10% (v/v) glycerol in distilled water, and
resuspended in 100 µl of the same solution. A 20-µl portion of these cells was electroporated
with 5 µg of linear DNA (i.e., the truncated uspA1 gene containing the cat cartridge) in 5 µl of
water in a microelectroporation chamber (Cel-Porator Electroporation system; Bethesda
Research Laboratories, Gaithersburg, MD) by applying a field strength of 16.2 kV over a
distance of 0.15 cm. Following electroporation, the cell suspension was transferred to 1 ml of
BHI broth and incubated with shaking at 37°C for 90 min. Ten 100-µl portions were then
spread on BHI agar plates containing the appropriate antimicrobial compound.

Southern blot analysis. Chromosomal DNA purified from wild-type and mutant
M. catarrhalis strains was digested with either PvuII or HindIII (New England Biolabs)
and Southern blot analysis was performed as described (Sambrook et al., 1989). Double-
stranded DNA probes were labeled with 32P by using the Random Primed DNA Labeling Kit
(Boehringer-Mannheim, Indianapolis, IN).

Indirect antibody-accessibility assay. Overnight BHI broth cultures of M. catarrhalis
strain O35E and its isogenic mutants were diluted in PBS buffer containing 10% (v/v) fetal
bovine serum and 0.025% (w/v) sodium azide (PBS-FBS-A) to a density of 110 Klett units (ca.
10^9 cfu/ml) as measured with a Klett-Summerson colorimeter (Klett Manufacturing Co., New
York, NY). Portions (100 µl) of this suspension were added to 1 ml of MAb 17C7 or MAb
3F12 culture supernatant. After incubation at 4°C for one hour with gentle agitation, the
bacterial cells were washed once and suspended in 1 ml of PBS-FBS-A. Affinity-purified goat
anti-mouse immunoglobulin, radiolabeled with 125I to a specific activity of 10^8 cpm per µg, was
added and the mixture was incubated for one hour at 4°C with gentle agitation. The cells were
then washed four times with 1 ml of PBS-FBS-A, suspended in 500 µl of triple detergent
(Helminen et al., 1993a) and transferred to glass tubes. The radioactivity present in each
sample was measured by using a gamma counter.



Autoagglutination and hemagglutination assays. The ability of M. catarrhalis strains to
autoagglutinate was assessed using bacterial cells grown overnight on a BHI agar plate. These
cells were resuspended in PBS to a turbidity of 400 Klett units in a glass tube and subsequently
allowed to stand at room temperature for ten minutes, at which time the turbidity of this
suspension was again determined. Rapid and slow autoagglutination were defined as turbidities
of less than and greater than 200 Klett units, respectively, after 10 minutes. The
hemagglutination slide assay using heparinized human group O Rh+ erythrocytes was
performed as previously described (Soto-Hernandez et al., 1989).
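
The autoagglutination criterion stated above amounts to a simple threshold rule, sketched here in Python. The function name and the handling of a reading of exactly 200 Klett units are assumptions; the text defines only the "less than" and "greater than" cases.

```python
def classify_autoagglutination(klett_after_10_min: float,
                               cutoff: float = 200.0) -> str:
    """Classify a suspension that started at 400 Klett units.

    The 200-unit cutoff comes from the text; a reading exactly at the
    cutoff is not defined there, so it is reported as 'undefined'.
    """
    if klett_after_10_min < cutoff:
        return "rapid"
    if klett_after_10_min > cutoff:
        return "slow"
    return "undefined"


# Hypothetical readings, for illustration only
print(classify_autoagglutination(150))  # rapid
print(classify_autoagglutination(320))  # slow
```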

Serum bactericidal assay. Complement-sufficient normal adult human serum was
prepared by standard methods. Complement inactivation was achieved by heating the serum for
30 min at 56°C. A M. catarrhalis broth culture in early logarithmic phase was diluted in
Veronal-buffered saline containing 0.10% (w/v) gelatin (GVBS) to a concentration of 1-2 x 10^
cfu/ml, and 20 µl portions were added to 20 µl of native or heat-inactivated normal human
serum together with 160 µl of Veronal-buffered saline containing 5 mM MgCl2 and 1.5 mM
CaCl2. This mixture was incubated at 37°C in a stationary water bath. At time 0 and at 15 and
30 min, 10 µl aliquots were removed, suspended in 75 µl of BHI broth and spread onto
prewarmed BHI agar plates.

Adherence assay. A method used to measure adherence of Haemophilus influenzae to
Chang conjunctival cells in vitro (St. Geme III and Falkow, 1990) was adapted for use with
M. catarrhalis. Briefly, 2-3 x 10^5 HEp-2 cells (ATCC CCL 23) or Chang conjunctival cells
(ATCC CCL 20.2) were seeded into each well in a 24-well tissue culture plate (Corning-Costar)
and incubated for 24 h before use. A 0.3 ml volume from an antibiotic-free overnight culture of
M. catarrhalis was inoculated into 10 ml of fresh BHI medium lacking antibiotics and this
culture was subsequently allowed to grow to a concentration of approximately 5 x 10^8 cfu/ml
(120 Klett units) with shaking in a gyrotory water bath. The culture was harvested by
centrifugation at 6,000 x g at 4-8°C for 10 min. The supernatant was discarded and a Pasteur
pipet was used to gently resuspend the bacterial cells in 5 ml of pH 7.4 phosphate-buffered
saline (PBS) or PBS containing 0.15% (w/v) gelatin (PBS-G). The bacterial cells were
centrifuged again and this final pellet was gently resuspended in 6-8 ml of PBS or PBS-G.

Portions (25 µl) of this suspension (10^7 CFU) were inoculated into the wells of a 24-
well tissue culture plate containing monolayers of HEp-2 or Chang cells. These tissue culture
plates were centrifuged for 5 min at 165 x g and then incubated for 30 min at 37°C. Non-
adherent bacteria were removed by rinsing the wells gently five times with PBS or PBS-G, and
the epithelial cells were then released from the plastic support by adding 200 µl of PBS
containing 0.05% trypsin and 0.02% EDTA. This cell suspension was serially diluted in PBS
or PBS-G and spread onto BHI plates to determine the number of viable M. catarrhalis present.
Adherence was expressed as the percentage of bacteria attached to the human cells relative to
the original inoculum added to the well.
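
The adherence calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the colony count, the dilution factor, and the plated volume are all hypothetical, while the 200 µl suspension volume and the definition of adherence as a percentage of the inoculum follow the text.

```python
def cfu_in_well(colonies: int, dilution_factor: int,
                plated_ul: float, suspension_ul: float) -> float:
    """Back-calculate viable bacteria in the trypsinized well suspension.

    colonies        -- colonies counted on one BHI plate
    dilution_factor -- e.g. 1000 for a 10^-3 serial dilution
    plated_ul       -- volume spread on the plate (hypothetical value below)
    suspension_ul   -- total volume of the released cell suspension
    """
    return colonies * dilution_factor * suspension_ul / plated_ul


def percent_adherence(adherent_cfu: float, inoculum_cfu: float) -> float:
    """Adherent bacteria as a percentage of the original inoculum."""
    return 100.0 * adherent_cfu / inoculum_cfu


# Hypothetical plate count: 150 colonies from a 10^-3 dilution, 100 µl plated,
# 200 µl trypsin-EDTA suspension, inoculum of 1 x 10^7 CFU per well.
adherent = cfu_in_well(colonies=150, dilution_factor=1000,
                       plated_ul=100, suspension_ul=200)
print(percent_adherence(adherent, 1e7))  # 3.0
```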

Results 

Construction of an isogenic M. catarrhalis mutant lacking expression of both UspA1
and UspA2. Construction of M. catarrhalis mutants lacking the ability to express either UspA1
(mutant strain O35E.1) or UspA2 (mutant strain O35E.2) has been described in previous
examples (Aebi et al., 1997). For constructing a double mutant that lacked expression of both
UspA1 and UspA2, the 0.6 kb BglII fragment of pUSPA1 (FIG. 14A) was replaced by a cat
cassette, yielding the recombinant plasmid pUSPA1CAT. Using the primers P1 and P2, the 3.2
kb insert of pUSPA1CAT was amplified by PCR™. This PCR™ product was used to
electroporate the kanamycin-resistant uspA2 strain O35E.2 and yielded the chloramphenicol-
and kanamycin-resistant transformant O35E.12, a putative uspA1 uspA2 double mutant.

Southern blot analysis was used to confirm that strains O35E.1, O35E.2, and O35E.12
were isogenic mutants and that allelic exchange had occurred properly, resulting in replacement
of the wild-type uspA1 or uspA2 gene, or both, with the mutated allele. Chromosomal DNA
preparations from the wild-type parent strain O35E, the uspA1 mutant O35E.1, the uspA2
mutant O35E.2, and the putative uspA1 uspA2 mutant strain O35E.12 were digested to
completion with PvuII and probed in Southern blot analysis with DNA fragments derived from
these two M. catarrhalis genes or with the kan cartridge. For probing with the cat cartridge,
chromosomal DNA from strain O35E.12 was digested with HindIII.
The uspA1-specific DNA probe was obtained by PCR™-based amplification of
M. catarrhalis strain O35E chromosomal DNA using the primers P3 and P4 (FIG. 14A). A
500-bp uspA2-specific DNA fragment was amplified from O35E chromosomal DNA by PCR™
with the primers P5 and P6 (FIG. 14B). Use of these two gene-specific probes together with the
kan and cat cartridges in Southern blot analysis confirmed that strain O35E.12 was a uspA1
uspA2 double mutant.

Characterization of selected proteins expressed by the wild-type and mutant
M. catarrhalis strains. Proteins present in outer membrane vesicles extracted from the wild-
type and these three mutant strains were resolved by SDS-PAGE and either stained with
Coomassie blue (FIG. 15A) or probed with MAb 17C7 in western blot analysis (FIG. 15B).
The wild-type parent strain O35E possessed a very high molecular weight band detectable by
Coomassie blue staining (FIG. 15A, lane 1, closed arrow) that was similarly abundant in
the uspA1 mutant O35E.1 (FIG. 15A, lane 2). The uspA2 mutant O35E.2 (FIG. 15A, lane 3)
had a much reduced level of expression of a band in this same region of the gel; this band was
not visible at all in the uspA1 uspA2 double mutant O35E.12 (FIG. 15A, lane 4).

Western blot analysis revealed that the wild-type strain (FIG. 15B, lane 1) expressed
abundant amounts of MAb 17C7-reactive antigen, most of which had a very high molecular
weight, in excess of 220,000. The wild-type strain also exhibited discrete antigens with
apparent molecular weights of approximately 120,000 and 85,000 which bound this MAb (FIG.
15B, lane 1, open and closed arrows, respectively). The uspA1 mutant O35E.1 (FIG. 15B, lane
2) lacked expression of the 120 kDa antigen, which was proposed to be the monomeric form of
UspA1, but still expressed the 85 kDa antigen. The amount of very high molecular weight
MAb 17C7-reactive antigen expressed by this uspA1 mutant appeared to be equivalent to that
expressed by the wild-type strain. The uspA2 mutant O35E.2 (FIG. 15B, lane 3) expressed the
120 kDa antigen but lacked expression of the 85 kDa antigen, which was proposed to be the
monomeric form of the UspA2 protein. In contrast to the uspA1 mutant, the uspA2 mutant had
relatively little very high molecular weight antigen reactive with MAb 17C7. Finally, the
uspA1 uspA2 double mutant O35E.12 (FIG. 15B, lane 4) expressed no detectable MAb 17C7-
reactive antigens.

Binding of MAb 17C7 to whole cells of the wild-type and mutant strains. The indirect
antibody-accessibility assay was used to determine whether both UspA1 and UspA2 are
exposed on the surface of M. catarrhalis and accessible to antibody. Whole cells of both the
wild-type strain O35E and the uspA1 mutant O35E.1 bound similar amounts of MAb 17C7
(Table XXXI). This result suggested that UspA2 is expressed on the surface of M. catarrhalis,
or at least on the surface of the uspA1 mutant. The uspA2 mutant O35E.2 bound substantially
less MAb 17C7 than did the wild-type strain, but the level of binding was still at least an order
of magnitude greater than that obtained with an irrelevant IgG MAb directed against a
H. ducreyi outer membrane protein (Table XXXI). As expected from the western blot analysis,
the uspA1 uspA2 double mutant O35E.12 did not bind MAb 17C7 at a level greater than that
obtained with the negative controls involving the H. ducreyi-specific MAb (Table XXXI).

TABLE XXXI
Binding of MAb 17C7 to the Surface of Wild-Type
and Mutant Strains of M. catarrhalis

                                          Binding^a of
Strain                                    MAb 17C7       MAb 3F12^b

O35E (wild-type)                          145,583^c      4,924
O35E.1 (uspA1 mutant)                     154,119        4,208
O35E.2 (uspA2 mutant)                     96,721         4,455
O35E.12 (uspA1 uspA2 double mutant)       6,081          3,997

a Counts per min of 125I-labeled goat anti-mouse immunoglobulin bound to
MAbs attached to the bacterial cell surface, as determined in the indirect
antibody-accessibility assay.

b MAb 3F12, a murine IgG antibody specific for a H. ducreyi outer membrane
protein (Klesney-Tait et al., 1997), was included as a negative control.

c The values represent the mean of two independent studies.
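
The comparisons drawn from Table XXXI can be reproduced with a short Python sketch that divides each MAb 17C7 count by the corresponding MAb 3F12 negative-control count. The counts are copied from the table; the function name and the use of a simple ratio as the summary statistic are assumptions made for illustration.

```python
# Counts per minute from Table XXXI: (MAb 17C7, MAb 3F12 negative control)
binding_cpm = {
    "O35E": (145583, 4924),
    "O35E.1": (154119, 4208),
    "O35E.2": (96721, 4455),
    "O35E.12": (6081, 3997),
}


def signal_to_background(cpm_17c7: int, cpm_control: int) -> float:
    """Ratio of MAb 17C7 binding to the irrelevant-MAb control."""
    return cpm_17c7 / cpm_control


ratios = {strain: signal_to_background(*cpm)
          for strain, cpm in binding_cpm.items()}

for strain, ratio in ratios.items():
    print(f"{strain}: {ratio:.1f}-fold over control")
```

On these figures the uspA2 mutant remains more than ten-fold above the control, consistent with the "order of magnitude" statement in the text, while the double mutant is within about two-fold of background.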

Characterization of the growth, autoagglutination, and hemagglutination properties of
the wild-type and mutant strains. The colony morphology of these three mutant strains grown
on BHI agar plates did not differ from that of the wild-type parent strain. Similarly, the
rate and extent of growth of all four of these strains in BHI broth were very similar if not
identical (FIG. 16). In an autoagglutination assay performed as described above in the
Materials and Methods section of this example, all four strains exhibited the same rate of
autoagglutination. Finally, there was no detectable difference between the wild-type parent and
the three mutants in a hemagglutination assay using human group O erythrocytes (Soto-
Hernandez et al., 1989). Control hemagglutination studies were performed using a pair of
M. catarrhalis isolates (i.e., strains P-44 and P-48) previously characterized as having rapid or
slow rates, respectively, of hemagglutination (Soto-Hernandez et al., 1989).

Effect of the uspA1 and uspA2 mutations on the ability of M. catarrhalis to adhere to
human cells. Preliminary studies revealed that the wild-type M. catarrhalis strain O35E
adhered readily to HeLa cells, HEp-2 cells, and Chang conjunctival cells in vitro. To determine
whether lack of expression of UspA1 or UspA2 affected this adherence ability, the wild-type
and the three mutant strains were first used in an attachment assay with HEp-2 cells. In this set
of studies, PBS was used as the diluent for washing the HEp-2 cell monolayers and for serial
dilution of the trypsinized HEp-2 cell monolayer at the completion of the assay. Both the wild-
type strain and the uspA2 mutant O35E.2 exhibited similar levels of attachment to HEp-2
monolayers (Table XXXII). The uspA1 mutant O35E.1, however, was less able to adhere to
these HEp-2 cells; lack of expression of UspA1 reduced the level of attachment by
approximately six-fold (Table XXXII). The uspA1 uspA2 double mutant O35E.12 exhibited a
similarly reduced level of attachment (Table XXXII).

TABLE XXXII
Adherence of Wild-Type and Mutant Strains of M. catarrhalis
to HEp-2 and Chang Conjunctival Cells in vitro

                                          Adherence^a to
Strain                                    HEp-2 cells^b            Chang cells^c

O35E (wild-type)                          14.7 ± 4.9               51.4 ± 30.8
O35E.1 (uspA1 mutant)                     2.4 ± 0.9 (0.006^d)      0.8 ± 0.5 (0.002^d)
O35E.2 (uspA2 mutant)                     19.1 ± 7.0 (0.213^d)     55.9 ± 16.7 (0.728^d)
O35E.12 (uspA1 uspA2 double mutant)       2.3 ± 1.8 (0.011^d)      0.6 ± 0.2 (0.002^d)

a Adherence is expressed as the percentage of the original inoculum that was adherent to
the human epithelial cells at the end of the 30 min incubation period. Each number
represents the mean (± S.D.) of two independent studies.

b PBS was used for washing of the monolayers and for serial dilutions of adherent
M. catarrhalis.

c PBS-G was used for washing of the monolayers and for serial dilutions of adherent
M. catarrhalis.

d P value when compared to the wild-type strain O35E using the two-tailed Student t-
test.
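
The fold differences discussed for Table XXXII follow directly from the tabulated means. The Python sketch below recomputes them; the means are copied from the table, and the function name `fold_reduction` and the log10 summary are illustrative assumptions rather than part of the original analysis.

```python
import math

# Mean percent adherence from Table XXXII
adherence = {
    "HEp-2": {"O35E": 14.7, "O35E.1": 2.4, "O35E.2": 19.1, "O35E.12": 2.3},
    "Chang": {"O35E": 51.4, "O35E.1": 0.8, "O35E.2": 55.9, "O35E.12": 0.6},
}


def fold_reduction(wild_type: float, mutant: float) -> float:
    """How many times lower the mutant's adherence is than the wild type's."""
    return wild_type / mutant


hep2_fold = fold_reduction(adherence["HEp-2"]["O35E"],
                           adherence["HEp-2"]["O35E.1"])
chang_fold = fold_reduction(adherence["Chang"]["O35E"],
                            adherence["Chang"]["O35E.1"])

print(f"HEp-2, uspA1 mutant: {hep2_fold:.1f}-fold lower")
print(f"Chang, uspA1 mutant: {chang_fold:.1f}-fold lower "
      f"({math.log10(chang_fold):.1f} orders of magnitude)")
```

The HEp-2 ratio is close to the "approximately six-fold" reduction stated in the text, and the Chang cell ratio corresponds to a little under two orders of magnitude.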

Control studies revealed, however, that M. catarrhalis cells did not survive well in the
PBS used for washing of the HEp-2 monolayer and serial dilution of the attached M. catarrhalis
organisms. When 10^8 CFU of the wild-type and mutant M. catarrhalis strains were suspended
in PBS, serially diluted, and allowed to stand for 30 min on ice, the viable number of bacteria
decreased to 10^7 CFU. In contrast, when PBS containing 0.15% (w/v) gelatin (PBS-G) was
used for this same type of experiment, there was no reduction in the viability of these
M. catarrhalis strains over the duration of the experiment. When the HEp-2 cell-based
attachment studies were repeated using PBS-G for washing the HEp-2 cell monolayer and as
the diluent, there was only a three-fold reduction in adherence of the uspA1 mutant relative to
that obtained with the wild-type parent strain. This finding suggested that the original six-fold
difference in attachment ability observed between the wild-type and uspA1 mutant strain may
have been attributable in part to viability problems caused by the use of the PBS wash and
diluent.

Subsequent studies using Chang conjunctival cells as the target for bacterial attachment
together with a PBS-G wash and diluent revealed a substantial difference in the attachment
abilities of the wild-type strain and the uspA1 mutant (Table XXXII). Whereas the wild-type
and uspA2 mutant exhibited similar levels of attachment to the Chang cells, the extent of
attachment of the uspA1 mutant was nearly two orders of magnitude less than that of the wild-
type parent strain. The uspA1 uspA2 double mutant also exhibited a much reduced level of
attachment, similar to that obtained with the uspA1 mutant (Table XXXII).

Effect of the uspA1 and uspA2 mutations on serum resistance of M. catarrhalis. Similar
to the majority of disease isolates of M. catarrhalis (Hol et al., 1993; 1995; Verduin et al.,
1994), the wild-type strain O35E was resistant to killing by normal human serum in vitro
(Helminen et al., 1993b). To examine the effect of the lack of expression of UspA1 or UspA2
on serum resistance, the wild-type strain and the three mutant strains were tested in a serum
bactericidal assay. Both the wild-type strain (FIG. 17, closed diamonds) and the uspA1 mutant
O35E.1 (FIG. 17, closed triangles) were able to grow in the presence of normal human serum,
indicating that lack of expression of UspA1 did not adversely affect the ability of strain O35E.1
to resist killing by normal human serum. However, both the uspA2 mutant O35E.2 (FIG. 17,
closed circles) and the uspA1 uspA2 double mutant O35E.12 (FIG. 17, closed squares), having
in common the lack of expression of UspA2, were readily killed by normal human serum.
Heat-based inactivation of the complement system present in this normal human serum
eliminated the ability of this serum to kill these latter two mutants (FIG. 17, open circles and
squares).

All of the compositions and methods disclosed and claimed herein can be made and
executed without undue experimentation in light of the present disclosure. While the
compositions and methods of this invention have been described in terms of preferred
embodiments, it will be apparent to those of skill in the art that variations may be applied to the
compositions and methods and in the steps or in the sequence of steps of the method described
herein without departing from the concept, spirit and scope of the invention. More specifically,
it will be apparent that certain agents which are both chemically and physiologically related
may be substituted for the agents described herein while the same or similar results would be
achieved. All such similar substitutes and modifications apparent to those skilled in the art are
deemed to be within the spirit, scope and concept of the invention as defined by the appended
claims.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

    (i) APPLICANT:
        (A) NAME: Board of Regents, The University of Texas System
        (B) STREET: 201 W. 7th Street
        (C) CITY: Austin
        (D) STATE: Texas
        (E) COUNTRY: USA
        (F) POSTAL CODE (ZIP): 78701

    (ii) TITLE OF INVENTION: UspA1 AND UspA2 ANTIGENS OF MORAXELLA
         CATARRHALIS

    (iii) NUMBER OF SEQUENCES: 98

    (iv) COMPUTER READABLE FORM:
        (A) MEDIUM TYPE: Floppy disk
        (B) COMPUTER: IBM PC compatible
        (C) OPERATING SYSTEM: PC-DOS/MS-DOS
        (D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

    (vi) PRIOR APPLICATION DATA:
        (A) APPLICATION NUMBER: US 60/033,598
        (B) FILING DATE: 20-DEC-1996



(2) INFORMATION FOR SEQ ID NO: 1:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 831 amino acids
        (B) TYPE: amino acid
        (C) STRANDEDNESS:
        (D) TOPOLOGY: linear

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Asn Lys Ile Tyr Lys Val Lys Lys Asn Ala Ala Gly His Leu Val
1 5 10 15

Ala Cys Ser Glu Phe Ala Lys Gly His Thr Lys Lys Ala Val Leu Gly
20 25 30

Ser Leu Leu Ile Val Gly Ala Leu Gly Met Ala Thr Thr Ala Ser Ala
35 40 45

Gln Ala Thr Asn Ser Lys Gly Thr Gly Ala His Ile Gly Val Asn Asn
50 55 60

Asn Asn Glu Ala Pro Gly Ser Tyr Ser Phe Ile Gly Ser Gly Gly Tyr
65 70 75 80

Asn Lys Ala Asp Arg Tyr Ser Ala Ile Gly Gly Gly Leu Phe Asn Lys
85 90 95

Ala Thr Asn Glu Tyr Ser Thr Ile Val Gly Gly Gly Tyr Asn Lys Ala
100 105 110

Glu Gly Arg Tyr Ser Thr Ile Gly Gly Gly Ser Asn Asn Glu Ala Thr
115 120 125

Asn Glu Tyr Ser Thr Ile Val Gly Gly Asp Asp Asn Lys Ala Thr Gly
130 135 140

Arg Tyr Ser Thr Ile Gly Gly Gly Asp Asn Asn Thr Arg Glu Gly Glu
145 150 155 160

Tyr Ser Thr Val Ala Gly Gly Lys Asn Asn Gln Ala Thr Gly Thr Gly
165 170 175

Ser Phe Ala Ala Gly Val Glu Asn Gln Ala Asn Ala Glu Asn Ala Val
180 185 190

Ala Val Gly Lys Lys Asn Ile Ile Glu Gly Glu Asn Ser Val Ala Ile
195 200 205

Gly Ser Glu Asn Thr Val Lys Thr Glu His Lys Asn Val Phe Ile Leu
210 215 220

Gly Ser Gly Thr Thr Gly Val Thr Ser Asn Ser Val Leu Leu Gly Asn
225 230 235 240

Glu Thr Ala Gly Lys Gln Ala Thr Thr Val Lys Asn Ala Glu Val Gly
245 250 255

Gly Leu Ser Leu Thr Gly Phe Ala Gly Glu Ser Lys Ala Glu Asn Gly
260 265 270

Val Val Ser Val Gly Ser Glu Gly Gly Glu Arg Gln Ile Val Asn Val
275 280 285

Gly Ala Gly Gln Ile Ser Asp Thr Ser Thr Asp Ala Val Asn Gly Ser
290 295 300

Gln Leu His Ala Leu Ala Thr Val Val Asp Asp Asn Gln Tyr Asp Ile
305 310 315 320

Val Asn Asn Arg Ala Asp Ile Leu Asn Asn Gln Asp Asp Ile Lys Asp
325 330 335

Leu Gln Lys Glu Val Lys Gly Leu Asp Asn Glu Val Gly Glu Leu Ser
340 345 350

Arg Asp Ile Asn Ser Leu His Asp Val Thr Asp Asn Gln Gln Asp Asp
355 360 365

Ile Lys Glu Leu Lys Arg Gly Val Lys Glu Leu Asp Asn Glu Val Gly
370 375 380

Val Leu Ser Arg Asp Ile Asn Ser Leu His Asp Asp Val Ala Asp Asn
385 390 395 400

Gln Asp Asp Ile Ala Lys Asn Lys Ala Asp Ile Lys Gly Leu Asn Lys
405 410 415

Glu Val Lys Glu Leu Asp Lys Glu Val Gly Val Leu Ser Arg Asp Ile
420 425 430

Gly Ser Leu His Asp Asp Val Ala Thr Asn Gln Ala Asp Ile Ala Lys
435 440 445

Asn Gln Ala Asp Ile Lys Thr Leu Glu Asn Asn Val Glu Glu Glu Leu
450 455 460

Leu Asn Leu Ser Gly Arg Leu Leu Asp Gln Lys Ala Asp Ile Asp Asn
465 470 475 480

Asn Ile Asn Asn Ile Tyr Glu Leu Ala Gln Gln Gln Asp Gln His Ser
485 490 495

Ser Asp Ile Lys Thr Leu Lys Asn Asn Val Glu Glu Gly Leu Leu Asp
500 505 510

Leu Ser Gly Arg Leu Ile Asp Gln Lys Ala Asp Ile Ala Lys Asn Gln
515 520 525

Ala Asp Ile Ala Gln Asn Gln Thr Asp Ile Gln Asp Leu Ala Ala Tyr
530 535 540

Asn Glu Leu Gln Asp Gln Tyr Ala Gln Lys Gln Thr Glu Ala Ile Asp
545 550 555 560

Ala Leu Asn Lys Ala Ser Ser Glu Asn Thr Gln Asn Ile Ala Lys Asn
565 570 575

Gln Ala Asp Ile Ala Asn Asn Ile Asn Asn Ile Tyr Glu Leu Ala Gln
580 585 590

Gln Gln Asp Gln His Ser Ser Asp Ile Lys Thr Leu Ala Lys Val Ser
595 600 605

Ala Ala Asn Thr Asp Arg Ile Ala Lys Asn Lys Ala Glu Ala Asp Ala
610 615 620

Ser Phe Glu Thr Leu Thr Lys Asn Gln Asn Thr Leu Ile Glu Gln Gly
625 630 635 640

Glu Ala Leu Val Glu Gln Asn Lys Ala Ile Asn Gln Glu Leu Glu Gly
645 650 655

Phe Ala Ala His Ala Asp Ile Gln Asp Lys Gln Ile Leu Gln Asn Gln
660 665 670

Ala Asp Ile Thr Thr Asn Lys Thr Ala Ile Glu Gln Asn Ile Asn Arg
675 680 685

Thr Val Ala Asn Gly Phe Glu Ile Glu Lys Asn Lys Ala Gly Ile Ala
690 695 700

Thr Asn Lys Gln Glu Leu Ile Leu Gln Asn Asp Arg Leu Asn Arg Ile
705 710 715 720

Asn Glu Thr Asn Asn Arg Gln Asp Gln Lys Ile Asp Gln Leu Gly Tyr
725 730 735

Ala Leu Lys Glu Gln Gly Gln His Phe Asn Asn Arg Ile Ser Ala Val
740 745 750

Glu Arg Gln Thr Ala Gly Gly Ile Ala Asn Ala Ile Ala Ile Ala Thr
755 760 765

Leu Pro Ser Pro Ser Arg Ala Gly Glu His His Val Leu Phe Gly Ser
770 775 780

Gly Tyr His Asn Gly Gln Ala Ala Val Ser Leu Gly Ala Ala Gly Leu
785 790 795 800

Ser Asp Thr Gly Lys Ser Thr Tyr Lys Ile Gly Leu Ser Trp Ser Asp
805 810 815

Ala Gly Gly Leu Ser Gly Gly Val Gly Gly Ser Tyr Arg Trp Lys
820 825 830

(2) INFORMATION FOR SEQ ID NO : 2: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3349 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 2: 

ATCAGCATGT GAGCAAATGA CTGGCGTAAA TGACTGATGA GTGTCTATTT AATGAAAGAT 6 0 

ATCAATATAT AAAAGTTGAC TATAGCGATG CAATACAGTA AAATTTGTTA CGGCTAAACA 12 0 

TAACGACGGT CCAAGATGGC GGATATCGCC ATTTACCAAC CTGATAATCA GTTTGATAGC 18 0 

CATTAGCGAT GGC AT CAAGT TGTGTTGTTG TATTGTCATA TAAACGGTAA ATTTGGTTTG 240 

GTGGATGCCC CATC TG ATTT ACCGTCCCCC TAATAAGTGA GGGGGGGGGG GAGACCCCAG 3 00 

TCATTTATTA GGAGACTAAG ATGAATAAAA TTTATAAAGT GAAGAAAAAT GCCGCAGGTC 3 60 

ACTTGGTGGC ATGTTCTGAA TTTGCCAAAG GTCATACCAA AAAGGCAGTT TTGGGCAGTT 42 0 

TATTGATTGT TGGGGCGTTG GGCATGGCAA CGACGGCGTC TGCACAAGCA ACCAACAGCA 48 0 

AAGGCACAGG CGCGCACATC GGTGTTAACA ATAACAACGA AGCCCCAGGC AGTTACTCTT 54 0 

TCATCGGTAG TGGCGGTTAT AACAAAGCCG ACAGATACTC TGCCATCGGT GGTGGCCTTT 600 

TTAACAAAGC CACAAACGAG TACTCTACCA TCGTTGGTGG CGGTTATAAC AAAGCCGAAG 66 0 
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GCAGATACTC TACCATCGGT GGTGGCAGTA ACAACGAAGC CACAAACGAG TACTCTACCA 72 0 

TCGTTGGTGG CGATGACAAC AAAGCCACAG GCAGATACTC TACCATCGGT GGTGGCGATA 780

ACAACACACG CGAAGGCGAA TACTCAACCG TCGCAGGGGG CAAGAATAAC CAAGCCACAG 840

GTACAGGTTC ATTTGCCGCA GGTGTAGAGA ACCAAGCCAA TGCCGAAAAC GCCGTCGCCG 900

TGGGTAAAAA GAACATTATC GAAGGTGAAA ACTCAGTAGC CATCGGCTCT GAGAATACCG 960

TTAAAACAGA ACACAAAAAT GTCTTTATTC TTGGCTCTGG CACAACAGGT GTAACGAGTA 1020

ACTCAGTGCT ACTGGGTAAT GAGACCGCTG GCAAACAGGC GACCACTGTT AAGAATGCCG 1080

AAGTGGGTGG TCTAAGCCTA ACAGGATTTG CAGGGGAGTC AAAAGCTGAA AACGGCGTAG 1140

TTTCTGTGGG TAGTGAAGGC GGTGAGCGTC AAATCGTTAA TGTTGGTGCA GGTCAGATCA 1200

GTGACACCTC AACAGATGCT GTTAATGGCT CACAGCTACA TGCTTTGGCC ACAGTTGTTG 1260

ATGACAACCA ATATGACATT GTTAACAACC GAGCTGACAT TCTTAACAAC CAAGATGATA 1320

TCAAAGATCT TCAGAAGGAG GTGAAAGGTC TTGATAATGA GGTGGGTGAA TTAAGCCGAG 1380

ACATTAATTC ACTTCATGAT GTTACTGACA ACCAACAAGA TGACATCAAA GAGCTTAAGA 1440

GGGGGGTAAA AGAGCTTGAT AATGAGGTGG GTGTATTAAG CCGAGACATT AATTCACTTC 1500

ATGATGATGT TGCTGACAAC CAAGATGACA TTGCTAAAAA CAAAGCTGAC ATCAAAGGTC 1560

TTAATAAGGA GGTGAAAGAG CTTGATAAGG AGGTGGGTGT ATTAAGCCGA GACATTGGTT 1620

CACTTCATGA TGATGTTGCC ACCAACCAAG CTGACATTGC TAAAAACCAA GCGGATATCA 1680

AAACACTTGA AAACAATGTC GAAGAAGAAT TATTAAATCT AAGCGGTCGC CTGCTTGATC 1740

AGAAAGCGGA TATTGATAAT AACATCAACA ATATCTATGA GCTGGCACAA CAGCAAGATC 1800

AGCATAGCTC TGATATCAAA ACACTTAAAA ACAATGTCGA AGAAGGTTTA TTGGATCTAA 1860

GCGGTCGCCT CATTGATCAA AAAGCAGATA TTGCTAAAAA CCAAGCTGAC ATTGCTCAAA 1920

ACCAAACAGA CATCCAAGAT CTGGCCGCTT ACAATGAGCT ACAAGACCAG TATGCTCAAA 1980

AGCAAACCGA AGCGATTGAC GCTCTAAATA AAGCAAGCTC TGAGAATACA CAAAACATTG 2040

CTAAAAACCA AGCGGATATT GCTAATAACA TCAACAATAT CTATGAGCTG GCACAACAGC 2100

AAGATCAGCA TAGCTCTGAT ATCAAAACCT TGGCAAAAGT AAGTGCTGCC AATACTGATC 2160

GTATTGCTAA AAACAAAGCT GAAGCTGATG CAAGTTTTGA AACGCTCACC AAAAATCAAA 2220

ATACTTTGAT TGAGCAAGGT GAAGCATTGG TTGAGCAAAA TAAAGCCATC AATCAAGAGC 2280

TTGAAGGGTT TGCGGCTCAT GCAGATATTC AAGATAAGCA AATTTTACAA AACCAAGCTG 2340

ATATCACTAC CAATAAGACC GCTATTGAAC AAAATATCAA TAGAACTGTT GCCAATGGGT 2400

TTGAGATTGA GAAAAATAAA GCTGGTATTG CTACCAATAA GCAAGAGCTT ATTCTTCAAA 2460

ATGATCGATT AAATCGAATT AATGAGACAA ATAATCGTCA GGATCAGAAG ATTGATCAAT 2520

TAGGTTATGC ACTAAAAGAG CAGGGTCAGC ATTTTAATAA TCGTATTAGT GCTGTTGAGC 2580

GTCAAACAGC TGGAGGTATT GCAAATGCTA TCGCAATTGC AACTTTACCA TCGCCCAGTA 2640

GAGCAGGTGA GCATCATGTC TTATTTGGTT CAGGTTATCA CAATGGTCAA GCTGCGGTAT 2700

CATTGGGCGC GGCTGGGTTA AGTGATACAG GAAAATCAAC TTATAAGATT GGTCTAAGCT 2760

GGTCAGATGC AGGTGGATTA TCTGGTGGTG TTGGTGGCAG TTACCGCTGG AAATAAAGCC 2820

TAAATTTAAC TGCTGTGTCA AAAAATATGG TCTGTATAAA CAGACCATAT TTTTATCCAA 2880

AAAAATTATC TTAACTTTTA TAAAGTATTA TAAGCCAAAG CTGTAATAAT AAGAGATGTT 2940

GAAATAAGAG ATGTTAAAGC TGCTAGACAA TCGGCTTGCG ACGATAAAAT AAGATACCTG 3000

GAATGGACAG CCCCAAAACC AATGCTGAGA TGATAAAAAT CGCCTCAAAA AAATGACGCA 3060

TCATAACGAT AAATAAATCC ATATCAAATC CAAAATAGCC AATTTGTACC ATGCTAACCA 3120

TGGCTTTATA GGCAGCGATT CCCGGCATCA TACAAATCAA GCTAGGTACA ATCAAGGCTT 3180

TAGGTGGCAG GCCATGACGC TGAGCAAAAT GTACACCCAA AAAGCTACCC GCCATCGCCC 3240

CAAAGAATGT TGCCACAACC AAATGCACAC CAAAAATTAC CATCACTTGT TTTAAACCAA 3300

AACCAAGTGG TGTTACCATC ATGCAATGCA TGATGTATTG CTTTGTCAA 3349

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Lys Leu Leu Pro Leu Lys Ile Ala Val Thr Ser Ala Met Ile Val
1 5 10 15

Gly Leu Gly Ala Thr Ser Thr Val Asn Ala Gln Val Val Glu Gln Phe
20 25 30

Phe Pro Asn Ile Phe Phe Asn Glu Asn His Asp Glu Leu Asp Asp Ala
35 40 45

Tyr His Asn Met Ile Leu Gly Asp Thr Ala Ile Val Ser Asn Ser Gln
50 55 60

Asp Asn Ser Thr Gln Leu Lys Phe Tyr Ser Asn Asp Glu Asp Ser Val
65 70 75 80

Pro Asp Ser Leu Leu Phe Ser Lys Leu Leu His Glu Gln Gln Leu Asn
85 90 95

Gly Phe Lys Ala Gly Asp Thr Ile Ile Pro Leu Asp Lys Asp Gly Lys
100 105 110

Pro Val Tyr Thr Lys Asp Thr Arg Thr Lys Asp Gly Lys Val Glu Thr
115 120 125

Val Tyr Ser Val Thr Thr Lys Ile Ala Thr Gln Asp Asp Val Glu Gln
130 135 140

Ser Ala Tyr Ser Arg Gly Ile Gln Gly Asp Ile Asp Asp Leu Tyr Asp
145 150 155 160

Ile Asn Arg Glu Val Asn Glu Tyr Leu Lys Ala Thr His Asp Tyr Asn
165 170 175

Glu Arg Gln Thr Glu Ala Ile Asp Ala Leu Asn Lys Ala Ser Ser Ala
180 185 190

Asn Thr Asp Arg Ile Asp Thr Ala Glu Glu Arg Ile Asp Lys Asn Glu
195 200 205

Tyr Asp Ile Lys Ala Leu Glu Ser Asn Val Glu Glu Gly Leu Leu Glu
210 215 220

Leu Ser Gly His Leu Ile Asp Gln Lys Ala Asp Leu Thr Lys Asp Ile
225 230 235 240

Lys Ala Leu Glu Ser Asn Val Glu Glu Gly Leu Leu Glu Leu Ser Gly
245 250 255

His Leu Ile Asp Gln Lys Ala Asp Leu Thr Lys Asp Ile Lys Ala Leu
260 265 270

Glu Ser Asn Val Glu Glu Gly Leu Leu Asp Leu Ser Gly Arg Leu Leu
275 280 285

Asp Gln Lys Ala Asp Ile Ala Lys Asn Gln Ala Asp Ile Ala Gln Asn
290 295 300

Gln Thr Asp Ile Gln Asp Leu Ala Ala Tyr Asn Glu Leu Gln Asp Ala
305 310 315 320

Tyr Ala Lys Gln Gln Thr Glu Ala Ile Asp Ala Leu Asn Lys Ala Ser
325 330 335

Ser Glu Asn Thr Gln Asn Ile Ala Lys Asn Gln Ala Asp Ile Ala Asn
340 345 350

Asn Ile Asn Asn Ile Tyr Glu Leu Ala Gln Gln Gln Asp Gln His Ser
355 360 365

Ser Asp Ile Lys Thr Leu Ala Lys Ala Ser Ala Ala Asn Thr Asp Arg
370 375 380

Ile Ala Lys Asn Lys Ala Asp Ala Asp Ala Ser Phe Glu Thr Leu Thr
385 390 395 400

Lys Asn Gln Asn Thr Leu Ile Glu Lys Asp Lys Glu His Asp Lys Leu
405 410 415

Ile Thr Ala Asn Lys Thr Ala Ile Asp Ala Asn Lys Ala Ser Ala Asp
420 425 430

Thr Lys Phe Ala Ala Thr Ala Asp Ala Ile Thr Lys Asn Gly Asn Ala
435 440 445

Ile Thr Lys Asn Ala Lys Ser Ile Thr Asp Leu Gly Thr Lys Val Asp
450 455 460

Gly Phe Asp Gly Arg Val Thr Ala Leu Asp Thr Lys Val Asn Ala Leu
465 470 475 480

Asp Thr Lys Val Asn Ala Phe Asp Gly Arg Ile Thr Ala Leu Asp Ser
485 490 495

Lys Val Glu Asn Gly Met Ala Ala Gln Ala Ala Leu Ser Gly Leu Phe
500 505 510

Gln Pro Tyr Ser Val Gly Lys Phe Asn Ala Thr Ala Ala Leu Gly Gly
515 520 525

Tyr Gly Ser Lys Ser Ala Val Ala Ile Gly Ala Gly Tyr Arg Val Asn
530 535 540

Pro Asn Leu Ala Phe Lys Ala Gly Ala Ala Ile Asn Thr Ser Gly Asn
545 550 555 560

Lys Lys Gly Ser Tyr Asn Ile Gly Val Asn Tyr Glu Phe
565 570



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2596 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CTGGTGGTCG CAGGGGGCGT CTCTGCCAAT CAGTACACTA CGCCGCACCC TGACCGAAAC 60

GCTCCGCCAA ATCGATGCGT CGGTGTACCA TGCCCCGACC GAGCTATGCA CGGATAATGG 120

TGCGATGATC GCCTATGCTG GCTTTTGTCG GCTAATCCGT GGACAGTCGG ATGACTTGGT 180

GGTTCGCTGC ATTCCCCGAT GGGATATGAC GACGCTTGGC GTATCTGCTC ATAAATAGCC 240

ACATCAATCA TACCAACCAA ATCATACCAA CCAAATCGTA CAAACGGTTG ATACATGCCA 300

AAAATACCAT ATTGAAAGTA GGGTTTGGGT ATTATTTATG TAACTTATAT CTAATTTGGT 360

GTTGATACTT TGATAAAGCC TTGCTATACT GTAACCTAAA TGGATATGAT AGAGATTTTT 420

CCATTTATGC CAGCAAAAGA GATAGATAGA TAGATAGATA GATAGATAGA TAGATAGATA 480

GATAGATAGA TAGATAGATA AAACTCTGTC TTTTATCTGT CCGCTGATGC TTTCTGCCTG 540

CCACCGATGA TATCATTTAT CTGCTTTTTA GGCATCAGTT ATTTCACCGT GATGACTGAT 600

GTGATGACTT AACTACCAAA AGAGAGTGCT AAATGAAAAC CATGAAACTT CTCCCCCTAA 660

AAATCGCTGT AACCAGTGCC ATGATTGTTG GCTTGGGTGC GACATCTACT GTGAATGCAC 720

AAGTAGTGGA ACAGTTTTTT CCGAATATCT TTTTTAATGA AAACCATGAT GAATTAGATG 780

ATGCATACCA TAATATGATC TTAGGGGATA CTGCGATTGT ATCTAATTCA CAAGATAATA 840

GTACTCAATT GAAATTTTAT TCTAATGATG AAGATTCAGT TCCTGACAGC CTACTCTTTA 900

GTAAACTACT TCATGAGCAG CAACTTAATG GTTTTAAAGC AGGTGACACA ATCATTCCTT 960

TGGATAAGGA TGGCAAACCT GTTTATACAA AGGACACGAG AACAAAGGAT GGTAAAGTAG 1020

AAACAGTTTA TTCGGTCACC ACCAAAATCG CTACCCAAGA TGATGTTGAA CAAAGTGCAT 1080

ATTCACGAGG CATTCAAGGT GATATCGATG ATCTGTATGA CATTAACCGT GAAGTCAATG 1140

AATACTTAAA AGCAACACAT GATTATAATG AAAGACAAAC TGAAGCAATT GACGCTCTAA 1200

ACAAAGCAAG CTCTGCGAAT ACTGATCGTA TTGATACTGC TGAAGAGCGT ATCGATAAAA 1260

ACGAATATGA CATTAAAGCA CTTGAAAGCA ATGTCGAAGA AGGTTTGTTG GAGCTAAGCG 1320

GTCACCTCAT TGATCAAAAA GCAGATCTTA CAAAAGACAT CAAAGCACTT GAAAGCAATG 1380

TCGAAGAAGG TTTGTTGGAG CTAAGCGGTC ACCTCATTGA TCAAAAAGCA GATCTTACAA 1440

AAGACATCAA AGCACTTGAA AGCAATGTCG AAGAAGGTTT GTTGGATCTA AGCGGTCGTC 1500

TGCTTGATCA AAAAGCAGAT ATCGCTAAAA ACCAAGCTGA CATTGCTCAA AACCAAACAG 1560

ACATCCAAGA TCTAGCCGCT TACAACGAGC TACAAGATGC CTATGCCAAA CAGCAAACCG 1620

AAGCGATTGA CGCTCTAAAC AAAGCAAGCT CTGAGAATAC ACAAAACATT GCTAAAAACC 1680

AAGCGGATAT TGCTAATAAC ATCAACAATA TCTATGAGCT GGCACAACAG CAAGATCAGC 1740

ATAGCTCTGA TATCAAAACC TTGGCAAAAG CAAGTGCTGC CAATACTGAT CGTATTGCTA 1800

AAAACAAAGC CGATGCTGAT GCAAGTTTTG AAACGCTCAC CAAAAATCAA AATACTTTGA 1860

TTGAAAAAGA TAAAGAGCAT GACAAATTAA TTACTGCAAA CAAAACTGCG ATTGATGCCA 1920

ATAAAGCATC TGCGGATACC AAGTTTGCAG CGACAGCAGA CGCCATTACC AAAAATGGAA 1980

ATGCTATCAC TAAAAACGCA AAATCTATCA CTGATTTGGG TACTAAAGTG GATGGTTTTG 2040

ACGGTCGTGT AACTGCATTA GACACCAAAG TCAATGCCTT AGACACCAAA GTCAATGCCT 2100

TTGATGGTCG TATCACAGCT TTAGACAGTA AAGTTGAAAA CGGTATGGCT GCCCAAGCTG 2160

CCCTAAGTGG TCTATTCCAG CCTTATAGCG TTGGTAAGTT TAATGCGACC GCTGCACTTG 2220

GTGGCTATGG CTCAAAATCT GCGGTTGCTA TCGGTGCTGG CTATCGTGTG AATCCAAATC 2280

TGGCGTTTAA AGCTGGTGCG GCGATTAATA CCAGTGGTAA TAAAAAAGGC TCTTATAACA 2340

TCGGTGTGAA TTACGAGTTT TAATTGTCTA TCATCACCAA AAAAAAGCAG TCAGTTTACT 2400

GGCTGCTTTT TTATGGGTTT TTGTGGCTTT TGGTTGTGAG TGATGGATAA AAGCTTATCA 2460

AGCGATTGAT GAATATCAAT AAATGATTGG TAAATATCAA TAAAGCGGTT TAGGGTTTTT 2520

GGATATCTTT TAATAAGTTT AAAAACCCCT GCATAAAATA AAGCTGGGCA TCAGAGCTGC 2580

GAGTAGCGGC ATACAG 2596



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 892 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Asn Lys Ile Tyr Lys Val Lys Lys Asn Ala Ala Gly His Leu Val
1 5 10 15

Ala Cys Ser Glu Phe Ala Lys Gly His Thr Lys Lys Ala Val Leu Gly
20 25 30

Ser Leu Leu Ile Val Gly Ala Leu Gly Met Ala Thr Thr Ala Ser Ala
35 40 45

Gln Ala Thr Lys Gly Thr Gly Lys His Val Val Asp Asn Lys Asp Asn
50 55 60

Lys Ala Lys Gly Asp Tyr Ser Thr Ala Ser Gly Gly Lys Asp Asn Glu
65 70 75 80

Ala Lys Gly Asn Tyr Ser Thr Val Gly Gly Gly Asp Tyr Asn Glu Ala
85 90 95

Lys Gly Asn Tyr Ser Thr Val Gly Gly Gly Ser Ser Asn Thr Ala Lys
100 105 110

Gly Glu Lys Ser Thr Ile Gly Gly Gly Asp Thr Asn Asp Ala Asn Gly
115 120 125

Thr Tyr Ser Thr Ile Gly Gly Gly Tyr Tyr Ser Arg Ala Ile Gly Asp
130 135 140

Ser Ser Thr Ile Gly Gly Gly Tyr Tyr Asn Gln Ala Thr Gly Glu Lys
145 150 155 160

Ser Thr Val Ala Gly Gly Arg Asn Asn Gln Ala Thr Gly Asn Asn Ser
165 170 175

Thr Val Ala Gly Gly Ser Tyr Asn Gln Ala Thr Gly Asn Asn Ser Thr
180 185 190

Val Ala Gly Gly Ser His Asn Gln Ala Thr Gly Glu Gly Ser Phe Ala
195 200 205

Ala Gly Val Glu Asn Lys Ala Asn Ala Asn Asn Ala Val Ala Leu Gly
210 215 220

Lys Asn Asn Thr Ile Asp Gly Asp Asn Ser Val Ala Ile Gly Ser Asn
225 230 235 240

Asn Thr Ile Asp Ser Gly Lys Gln Asn Val Phe Ile Leu Gly Ser Ser
245 250 255

Thr Asn Thr Thr Asn Ala Gln Ser Gly Ser Val Leu Leu Gly His Asn
260 265 270

Thr Ala Gly Lys Lys Ala Thr Ala Val Ser Ser Ala Lys Val Asn Gly
275 280 285

Leu Thr Leu Gly Asn Phe Ala Gly Ala Ser Lys Thr Gly Asn Gly Thr
290 295 300

Val Ser Val Gly Ser Glu Asn Asn Glu Arg Gln Ile Val Asn Val Gly
305 310 315 320

Ala Gly Asn Ile Ser Ala Asp Ser Thr Asp Ala Val Asn Gly Ser Gln
325 330 335

Leu Tyr Ala Leu Ala Thr Ala Val Lys Ala Asp Ala Asp Glu Asn Phe
340 345 350

Lys Ala Leu Thr Lys Thr Gln Asn Thr Leu Ile Glu Gln Gly Glu Ala
355 360 365

Gln Asp Ala Leu Ile Ala Gln Asn Gln Thr Asp Ile Thr Ala Asn Lys
370 375 380

Thr Ala Ile Glu Arg Asn Phe Asn Arg Thr Val Val Asn Gly Phe Glu
385 390 395 400

Ile Glu Lys Asn Lys Ala Gly Ile Ala Lys Asn Gln Ala Asp Ile Gln
405 410 415

Thr Leu Glu Asn Asn Val Gly Glu Glu Leu Leu Asn Leu Ser Gly Arg
420 425 430

Leu Leu Asp Gln Lys Ala Asp Ile Asp Asn Asn Ile Asn Asn Ile Tyr
435 440 445

Asp Leu Ala Gln Gln Gln Asp Gln His Ser Ser Asp Ile Lys Thr Leu
450 455 460

Lys Lys Asn Val Glu Glu Gly Leu Leu Asp Leu Ser Gly Arg Leu Ile
465 470 475 480

Asp Gln Lys Ala Asp Leu Thr Lys Asp Ile Lys Thr Leu Glu Asn Asn
485 490 495

Val Glu Glu Gly Leu Leu Asp Leu Ser Gly Arg Leu Ile Asp Gln Lys
500 505 510

Ala Asp Ile Ala Lys Asn Gln Ala Asp Ile Ala Gln Asn Gln Thr Asp
515 520 525

Ile Gln Asp Leu Ala Ala Tyr Asn Glu Leu Gln Asp Gln Tyr Ala Gln
530 535 540

Lys Gln Thr Glu Ala Ile Asp Ala Leu Asn Lys Ala Ser Ser Ala Asn
545 550 555 560

Thr Asp Arg Ile Ala Thr Ala Glu Leu Gly Ile Ala Glu Asn Lys Lys
565 570 575

Asp Ala Gln Ile Ala Lys Ala Gln Ala Asn Glu Asn Lys Asp Gly Ile
580 585 590

Ala Lys Asn Gln Ala Asp Ile Gln Leu His Asp Lys Lys Ile Thr Asn
595 600 605

Leu Gly Ile Leu His Ser Met Val Ala Arg Ala Val Gly Asn Asn Thr
610 615 620

Gln Gly Val Ala Thr Asn Lys Ala Asp Ile Ala Lys Asn Gln Ala Asp
625 630 635 640

Ile Ala Asn Asn Ile Lys Asn Ile Tyr Glu Leu Ala Gln Gln Gln Asp
645 650 655

Gln His Ser Ser Asp Ile Lys Thr Leu Ala Lys Val Ser Ala Ala Asn
660 665 670

Thr Asp Arg Ile Ala Lys Asn Lys Ala Glu Ala Asp Ala Ser Phe Glu
675 680 685

Thr Leu Thr Lys Asn Gln Asn Thr Leu Ile Glu Gln Gly Glu Ala Leu
690 695 700

Val Glu Gln Asn Lys Ala Ile Asn Gln Glu Leu Glu Gly Phe Ala Ala
705 710 715 720

His Ala Asp Val Gln Asp Lys Gln Ile Leu Gln Asn Gln Ala Asp Ile
725 730 735

Thr Thr Asn Lys Ala Ala Ile Glu Gln Asn Ile Asn Arg Thr Val Ala
740 745 750

Asn Gly Phe Glu Ile Glu Lys Asn Lys Ala Gly Ile Ala Thr Asn Lys
755 760 765

Gln Glu Leu Ile Leu Gln Asn Asp Arg Leu Asn Gln Ile Asn Glu Thr
770 775 780

Asn Asn Arg Gln Asp Gln Lys Ile Asp Gln Leu Gly Tyr Ala Leu Lys
785 790 795 800

Glu Gln Gly Gln His Phe Asn Asn Arg Ile Ser Ala Val Glu Arg Gln
805 810 815

Thr Ala Gly Gly Ile Ala Asn Ala Ile Ala Ile Ala Thr Leu Pro Ser
820 825 830

Pro Ser Arg Ala Gly Glu His His Val Leu Phe Gly Ser Gly Tyr His
835 840 845

Asn Gly Gln Ala Ala Val Ser Leu Gly Ala Ala Gly Leu Ser Asp Thr
850 855 860

Gly Lys Ser Thr Tyr Lys Ile Gly Leu Ser Trp Ser Asp Ala Gly Gly
865 870 875 880

Leu Ser Gly Gly Val Gly Gly Ser Tyr Arg Trp Lys
885 890



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3381 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
TGTGAGCAAA TGACTGGCGT AAATGACTGA TGAATGTCTA TTTAATGAAA GATATCAATA 60

TATAAAAGTT GACTATAGCG ATGCAATACA GTAAAATTTG TTACGGCTAA ACATAACGAC 120

GGTCCAAGAT GGCGGATATC GCCATTTACC AACCTGATAA TCAGTTTGAT AGCCATTAGC 180

GATGGCATCA AGTTGTGTTG TTGTATTGTC ATATAAACGG TAAATTTGGT TTGGTGGATG 240

CCCCATCTGA TTTACCGTCC CCCTAATAAG TGAGGGGGGG GGAGACCCCA GTCATTTATT 300

AGGAGACTAA GATGAACAAA ATTTATAAAG TGAAAAAAAA TGCCGCAGGT CACTTGGTGG 360

CATGTTCTGA ATTTGCCAAA GGCCATACCA AAAAGGCAGT TTTGGGCAGT TTATTGATTG 420

TTGGGGCATT GGGCATGGCA ACGACGGCGT CTGCACAAGC AACCAAAGGC ACAGGCAAGC 480

ACGTTGTTGA CAATAAGGAC AACAAAGCCA AAGGCGATTA CTCTACCGCC AGTGGTGGCA 540

AGGACAACGA AGCCAAAGGC AATTACTCTA CCGTCGGTGG TGGCGATTAT AACGAAGCCA 600

AAGGCAATTA CTCTACCGTC GGTGGTGGCT CTAGTAATAC CGCCAAAGGC GAGAAATCAA 660

CCATCGGTGG TGGCGATACT AACGACGCCA ACGGCACATA CTCTACCATC GGTGGTGGCT 720

ATTATAGCCG AGCCATAGGC GATAGCTCTA CCATCGGTGG TGGTTATTAT AACCAAGCCA 780

CAGGCGAGAA ATCAACGGTT GCAGGGGGCA GGAATAACCA AGCCACAGGC AACAACTCAA 840

CGGTTGCAGG CGGCTCTTAT AACCAAGCCA CAGGCAACAA CTCAACGGTT GCAGGTGGCT 900

CTCATAACCA AGCCACAGGT GAAGGTTCAT TTGCAGCAGG TGTAGAGAAC AAAGCCAATG 960

CCAACAACGC CGTCGCTCTA GGTAAAAATA ACACCATCGA TGGCGATAAC TCAGTAGCCA 1020

TCGGCTCTAA TAATACCATT GACAGTGGCA AACAAAATGT CTTTATTCTT GGCTCTAGCA 1080

CAAACACAAC AAATGCACAA AGCGGCTCCG TGCTGCTGGG TCATAATACC GCTGGCAAAA 1140

AAGCAACCGC TGTTAGCAGT GCCAAAGTGA ACGGCTTAAC CCTAGGAAAT TTTGCAGGTG 1200

CATCAAAAAC TGGTAATGGT ACTGTATCTG TCGGTAGTGA GAATAATGAG CGTCAAATCG 1260

TCAATGTTGG TGCAGGTAAT ATCAGTGCTG ATTCAACAGA TGCTGTTAAT GGCTCACAGC 1320

TATATGCTTT GGCCACAGCT GTCAAAGCCG ATGCCGATGA AAACTTTAAA GCACTCACCA 1380

AAACTCAAAA TACTTTGATT GAGCAAGGTG AAGCACAAGA CGCATTAATC GCTCAAAATC 1440

AAACTGACAT CACTGCCAAT AAAACTGCCA TTGAGCGAAA TTTTAATAGA ACTGTTGTCA 1500

ATGGGTTTGA GATTGAGAAA AATAAAGCTG GTATTGCTAA AAACCAAGCG GATATCCAAA 1560

CGCTTGAAAA CAATGTCGGA GAAGAACTAT TAAATCTAAG CGGTCGCCTG CTTGATCAAA 1620

AAGCGGATAT TGATAATAAC ATCAACAATA TCTATGATCT GGCACAACAG CAAGATCAGC 1680

ATAGCTCTGA TATCAAAACA CTTAAAAAAA ATGTCGAAGA AGGTTTGTTG GATCTAAGTG 1740

GTCGCCTCAT TGATCAAAAA GCAGATCTTA CGAAAGACAT CAAAACACTT GAAAACAATG 1800

TCGAAGAAGG TTTGTTGGAT CTAAGCGGTC GCCTCATTGA TCAAAAAGCA GATATTGCTA 1860

AAAACCAAGC TGACATTGCT CAAAACCAAA CAGACATCCA AGATCTGGCC GCTTACAACG 1920

AGCTACAAGA CCAGTATGCT CAAAAGCAAA CCGAAGCGAT TGACGCTCTA AATAAAGCAA 1980

GCTCTGCCAA TACTGATCGT ATTGCTACTG CTGAATTGGG TATCGCTGAG AACAAAAAAG 2040

ACGCTCAGAT CGCCAAAGCA CAAGCCAATG AAAATAAAGA CGGCATTGCT AAAAACCAAG 2100

CTGATATCCA GTTGCACGAT AAAAAAATCA CCAATCTAGG TATCCTTCAC AGCATGGTTG 2160

CAAGAGCGGT AGGAAATAAC ACACAAGGTG TTGCTACCAA TAAAGCTGAC ATTGCTAAAA 2220

ACCAAGCAGA TATTGCTAAT AACATCAAAA ATATCTATGA GCTGGCACAA CAGCAAGATC 2280

AGCATAGCTC TGATATCAAA ACCTTGGCAA AAGTAAGTGC TGCCAATACT GATCGTATTG 2340

CTAAAAACAA AGCTGAAGCT GATGCAAGTT TTGAAACGCT CACCAAAAAT CAAAATACTT 2400

TGATTGAGCA AGGTGAAGCA TTGGTTGAGC AAAATAAAGC CATCAATCAA GAGCTTGAAG 2460

GGTTTGCGGC TCATGCAGAT GTTCAAGATA AGCAAATTTT ACAAAACCAA GCTGATATCA 2520

CTACCAATAA GGCCGCTATT GAACAAAATA TCAATAGAAC TGTTGCCAAT GGGTTTGAGA 2580

TTGAGAAAAA TAAAGCTGGT ATTGCTACCA ATAAGCAAGA GCTTATTCTT CAAAATGATC 2640

GATTAAATCA AATTAATGAG ACAAATAATC GTCAGGATCA GAAGATTGAT CAATTAGGTT 2700

ATGCACTAAA AGAGCAGGGT CAGCATTTTA ATAATCGTAT TAGTGCTGTT GAGCGTCAAA 2760

CAGCTGGAGG TATTGCAAAT GCTATCGCAA TTGCAACTTT ACCATCGCCC AGTAGAGCAG 2820

GTGAGCATCA TGTCTTATTT GGTTCAGGTT ATCACAATGG TCAAGCTGCG GTATCATTGG 2880

GTGCGGCTGG GTTAAGTGAT ACAGGAAAAT CAACTTATAA GATTGGTCTA AGCTGGTCAG 2940

ATGCAGGTGG ATTATCTGGT GGTGTTGGTG GCAGTTACCG CTGGAAATAG AGCCTAAATT 3000

TAACTGCTGT ATCAAAAAAT ATGGTCTGTA TAAACAGACC ATATTTTTAT CTAAAAACTT 3060

ATCTTAACTT TTATGAAGCA TCATAAGCCA AAGCTGAGTA ATAATAAGAG ATGTTAAAAT 3120

AAGAGATGTT AAAACTGCTA AACAATCGGC TTACGACGAT AAAATAAAAT ACCTGGAATG 3180

GACAGCCCCA AAACCAATGC TGAGATGATA AAAATCGCCT CAAAAAAATG ACGCATCATA 3240

ACGATAAATA AATCCATATC AAATCCAAAA TAGCCAATTT GTACCATGCT AACCATGGCT 3300

TTATAGGCAG CGATTCCCGG CATCATACAA ATCAAGCTAG GTACAATCAA GGCTTTAGGC 3360

GGCAGGCCAT GACGCTGAGC A 3381

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 624 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Val Asn Lys Ile Tyr Lys Val Lys Lys Asn Ala Ala Gly His Ser Val
1 5 10 15

Ala Cys Ser Glu Phe Ala Lys Gly His Thr Lys Lys Ala Val Leu Gly
20 25 30

Ser Leu Leu Ile Val Gly Ala Leu Gly Met Ala Thr Thr Ala Ser Ala
35 40 45

Gln Thr Gly Ser Thr Asn Ala Ala Asn Gly Asn Ile Ile Ser Gly Val
50 55 60

Gly Ala Tyr Val Gly Gly Gly Val Ile Asn Gln Ala Lys Gly Asn Tyr
65 70 75 80

Pro Thr Val Gly Gly Gly Phe Asp Asn Arg Ala Thr Gly Asn Tyr Ser
85 90 95

Val Ile Ser Gly Gly Phe Asp Asn Gln Ala Lys Gly Glu His Ser Thr
100 105 110

Ile Ala Gly Gly Glu Ser Asn Gln Ala Thr Gly Arg Asn Ser Thr Val
115 120 125

Ala Gly Gly Ser Asn Asn Gln Ala Val Gly Thr Asn Ser Thr Val Ala
130 135 140

Gly Gly Ser Asn Asn Gln Ala Lys Gly Ala Asn Ser Phe Ala Ala Gly
145 150 155 160

Val Gly Asn Gln Ala Asn Thr Asp Asn Ala Val Ala Leu Gly Lys Asn
165 170 175

Asn Thr Ile Asn Gly Asn Asn Ser Ala Ala Ile Gly Ser Glu Asn Thr
180 185 190

Val Asn Glu Asn Gln Lys Asn Val Phe Ile Leu Gly Ser Asn Thr Thr
195 200 205

Asn Ala Gln Ser Gly Ser Val Leu Leu Gly His Glu Thr Ser Gly Lys
210 215 220

Glu Ala Thr Ala Val Ser Arg Ala Arg Val Asn Gly Leu Thr Leu Lys
225 230 235 240

Asn Phe Ser Gly Val Ser Lys Ala Asp Asn Gly Thr Val Ser Val Gly
245 250 255

Ser Gln Gly Lys Glu Arg Gln Ile Val His Val Gly Ala Gly Gln Ile
260 265 270

Ser Asp Asp Ser Thr Asp Ala Val Asn Gly Ser Gln Leu Tyr Ala Leu
275 280 285

Ala Thr Ala Val Asp Asp Asn Gln Tyr Asp Ile Glu Ile Asn Gln Asp
290 295 300

Asn Ile Lys Asp Leu Gln Lys Glu Val Lys Gly Leu Asp Lys Glu Val
305 310 315 320

Gly Val Leu Ser Arg Asp Ile Gly Ser Leu His Asp Asp Val Ala Asp
325 330 335

Asn Gln Ala Asp Ile Ala Lys Asn Lys Ala Asp Ile Lys Glu Leu Asp
340 345 350

Lys Glu Met Asn Val Leu Ser Arg Asp Ile Val Ser Leu Asn Asp Asp
355 360 365

Val Ala Asp Asn Gln Ala Asp Ile Ala Lys Asn Gln Ala Asp Ile Lys
370 375 380

Thr Leu Glu Asn Asn Val Glu Glu Gly Leu Leu Asp Leu Ser Gly Arg
385 390 395 400

Leu Ile Asp Gln Lys Ala Asp Ile Asp Asn Asn Ile Asn His Ile Tyr
405 410 415

Glu Leu Ala Gln Gln Gln Asp Gln His Ser Ser Asp Ile Lys Thr Leu
420 425 430

Ala Lys Ala Ser Ala Ala Asn Thr Asp Arg Ile Ala Lys Asn Lys Ala
435 440 445

Asp Ala Asp Ala Ser Phe Glu Thr Leu Thr Lys Asn Gln Asn Thr Leu
450 455 460

Ile Glu Lys Asp Lys Glu His Asp Lys Leu Ile Thr Ala Asn Lys Thr
465 470 475 480

Ala Ile Asp Ala Asn Lys Ala Ser Ala Asp Thr Lys Phe Ala Ala Thr
485 490 495

Ala Asp Ala Ile Thr Lys Asn Gly Asn Ala Ile Thr Lys Asn Ala Lys
500 505 510

Ser Ile Thr Asp Leu Gly Thr Lys Val Asp Gly Phe Asp Gly Arg Val
515 520 525

Thr Ala Leu Asp Thr Lys Val Asn Ala Phe Asp Gly Arg Ile Thr Ala
530 535 540

Leu Asp Ser Lys Val Glu Asn Gly Met Ala Ala Gln Ala Ala Leu Ser
545 550 555 560

Gly Leu Phe Gln Pro Tyr Ser Val Gly Lys Phe Asn Ala Thr Ala Ala
565 570 575

Leu Gly Gly Tyr Gly Ser Lys Ser Ala Val Ala Ile Gly Ala Gly Tyr
580 585 590

Arg Val Asn Pro Asn Leu Ala Phe Lys Ala Gly Ala Ala Ile Asn Thr
595 600 605

Ser Gly Asn Lys Lys Gly Ser Tyr Asn Ile Gly Val Asn Tyr Glu Phe
610 615 620

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3295 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GCCGCACCTG ACCGAGACGC TCCGCCAAAT CAATGCGTCG GTGTACTATG CCCCGACCGA 60

GCTATGCACG GATAATGGTG CGATGATCGC CTATGCTGGC TTTTGTCGGC TAAGCCGTGG 120

ACAGTCGGAT GACTTGGCGG TTCGCTGCAT TCCCCGATGG GATATGACAA CGCTTGGTAT 180

CGAATATGAT AATTAGGCTG TGGTATTTGA GTTTTGAGTA ATGTACCTAC TACCACTAAT 240

TTATCATACA ATACATAAAC ATAAAAAACA TCGGTATTGT TAAAAAACAA TACCCAAGTT 300

AAAATAGCTC AATACTTTAC CATAGCACAA AGAAACTTGT GAACGAAACA TTTAATAATT 360

GCCCAAAATG TCACTGCACA CACTTTGTAA AAGCAGGTTT GGGCAATGGC AAACAACGAT 420

ACAAATGCAA AGGTTACCAT CACTATTTTT CTGTGAAGCA ACGAAGCAAC CAAAAAAGTA 480

ATGACATTAA AAAAACAAGC CATTGATACA AACAGTAAAC AAATCTTAGG CTTTGTCTGT 540

GGTAAAACAG ACACTAACAC CTTTAAACGA CTTTATCAGC AGTTAAATAC CCATAACATT 600

CAACTGTTTT TTAGTGACTA CTGGAAATCT TATCGTCAAG TCATTTTAAA GCCAAAACAT 660

ATAACAAGCA AAGCTCAAAC TTTTACCATA GAGGACTATA ATAGTCTCAT TGGGCATTTC 720

ATAGCAAGAT TTACAAGAAA GTCAAAGTAT TATTCTAAAT CCGAAAAAAT GATAGAAAAC 780

ACGTTGAATT TATTATTTGC TAAGTGGAAT GGTAGCTTAA GATATGTATT TTAATTTAAC 840

AATGCCAAAA ACATCAATTA CAGTAAGATT TTAGGCGTTT TGCAGTTGCT ACTTTAGTAA 900

AGCTTTGTTA TACTAGCTGT TAATATACTC AAGCTTGTTT GTGTTTGAGC TATGTTTATT 960

TTATAGCAGT AGTTGGTTAT AAAATATAAA TAAAGCTAAG CTCGAGGGTT TGGTAATGGT 1020

TTTTTATGTT TATAATACCA ACAGAGTATC TATACAGCTA AAATAGCTAA TACCTTAGGT 1080

GTATTACAAG TAAAAATCCT TTGTTAATCA GGGAGTGTAT TATATGTATA TTTCCTTTGT 1140

ATTTGGTTAT AGCAATCCCT TGGTAAGAAA TCATATCTAT TTTTTATTGT TCAATTATTC 1200

AGGAGACTAA GGTGAACAAA ATTTATAAAG TGAAAAAAAA TGCCGCAGGT CATTCGGTGG 1260

CATGTTCTGA ATTTGCCAAA GGCCATACCA AAAAGGCAGT TTTGGGCAGT TTATTGATTG 1320

TTGGGGCATT GGGCATGGCA ACGACAGCGT CTGCACAAAC AGGCAGTACA AATGCAGCCA 1380

ACGGCAATAT AATCAGCGGC GTAGGCGCGT ACGTCGGTGG TGGCGTTATA AACCAAGCCA 1440

AAGGCAATTA CCCTACCGTC GGTGGTGGCT TTGATAACCG AGCCACAGGC AATTACTCTG 1500

TCATCAGTGG TGGCTTTGAT AACCAAGCCA AAGGCGAGCA CTCTACCATC GCAGGGGGTG 1560

AGAGTAACCA AGCTACAGGT CGTAACTCAA CGGTTGCAGG GGGTTCTAAT AACCAAGCCG 1620

TGGGTACAAA CTCAACGGTT GCAGGGGGTT CTAATAACCA AGCCAAAGGT GCAAATTCAT 1680

TTGCAGCAGG TGTAGGTAAC CAAGCCAATA CCGACAACGC CGTCGCTCTA GGTAAAAATA 1740

ACACCATCAA TGGCAATAAC TCAGCAGCCA TCGGCTCTGA GAATACCGTT AACGAAAATC 1800

AAAAAAATGT CTTTATTCTT GGCTCTAACA CAACAAATGC ACAAAGCGGC TCAGTACTGC 1860

TAGGTCATGA AACCTCTGGT AAAGAAGCGA CCGCTGTTAG CAGAGCCAGA GTGAACGGCT 1920

TAACCCTAAA AAATTTTTCA GGCGTATCAA AAGCTGATAA TGGTACTGTA TCTGTCGGTA 1980

GTCAGGGTAA AGAGCGTCAA ATCGTTCATG TTGGTGCAGG TCAGATCAGT GATGATTCAA 2040


CAGATGCTGT TAATGGCTCA CAGCTATATG CTTTGGCTAC AGCTGTTGAT GACAACCAAT 2100

ATGACATTGA AATAAACCAA GATAATATCA AAGATCTTCA GAAGGAGGTG AAAGGTCTTG 2160

ATAAGGAAGT GGGTGTATTA AGCCGAGACA TTGGTTCACT TCATGATGAT GTTGCTGACA 2220

ACCAAGCTGA TATTGCTAAA AACAAAGCTG ACATCAAAGA GCTTGATAAG GAGATGAATG 2280

TATTAAGCCG AGACATTGTC TCACTTAATG ATGATGTTGC TGATAACCAA GCTGACATTG 2340

CTAAAAACCA AGCGGATATC AAAACACTTG AAAACAATGT CGAAGAAGGT TTATTGGATC 2400

TAAGCGGTCG CCTCATTGAT CAAAAAGCAG ATATTGATAA TAACATCAAC CATATCTATG 2460

AGCTGGCACA ACAGCAAGAT CAGCATAGCT CTGATATCAA AACCTTGGCA AAAGCAAGTG 2520

CTGCCAATAC TGATCGTATT GCTAAAAACA AAGCCGATGC TGATGCAAGT TTTGAAACAC 2580

TCACCAAAAA TCAAAATACT TTGATTGAAA AAGATAAAGA GCATGACAAA TTAATTACTG 2640

CAAACAAAAC TGCGATTGAT GCCAATAAAG CATCTGCGGA TACCAAGTTT GCAGCGACAG 2700

CAGACGCCAT TACCAAAAAT GGAAATGCTA TCACTAAAAA CGCAAAATCT ATCACTGATT 2760

TGGGTACTAA AGTGGATGGT TTTGACGGTC GTGTAACTGC ATTAGACACC AAAGTCAATG 2820

CCTTTGATGG TCGCATCACA GCTTTAGACA GTAAAGTTGA AAACGGTATG GCTGCCCAAG 2880

CTGCCCTAAG TGGTCTATTC CAGCCTTATA GCGTTGGTAA GTTTAATGCG ACCGCTGCAC 2940

TTGGTGGCTA TGGCTCAAAA TCTGCGGTTG CTATCGGTGC TGGCTATCGT GTGAATCCAA 3000

ATCTGGCGTT TAAAGCTGGT GCGGCGATTA ATACCAGTGG CAATAAAAAA GGCTCTTATA 3060

ACATCGGTGT GAATTACGAG TTCTAATTGT CTATCATCAC CAAAAAAAGC AGTCAGTTTA 3120

CTGGCTGCTT TTTTATGGGT TTTTATGGCT TTTGGTTGTG AGTGATGGAT AAAAGCTTAT 3180

CAAGCGATTG ATGAATATCA ATAAATGATT GGTAAATATC AATAAAGCGG TTTAGGGTTT 3240

TTGGATATCT TTTAATAAGT TTAAAAACCC CTGCATAAAA TAAAGCTGGC ATCAG 3295



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 941 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Asn Lys Ile Tyr Lys Val Lys Lys Asn Ala Ala Gly His Leu Val
1 5 10 15

Ala Cys Ser Glu Phe Ala Lys Gly His Thr Lys Lys Ala Val Leu Gly
20 25 30

Ser Leu Leu Ile Val Gly Ile Leu Gly Met Ala Thr Thr Ala Ser Ala
35 40 45

Gln Met Ala Thr Thr Pro Ser Ala Gln Val Val Lys Thr Asn Asn Lys
50 55 60

Lys Asn Gly Thr His Pro Phe Ile Gly Gly Gly Asp Tyr Asn Thr Thr
65 70 75 80

Lys Gly Asn Tyr Pro Thr Ile Gly Gly Gly His Phe Asn Thr Ala Glu
85 90 95

Gly Asn Tyr Ser Thr Val Gly Gly Gly Phe Thr Asn Glu Ala Ile Gly
100 105 110

Lys Asn Ser Thr Val Gly Gly Gly Phe Thr Asn Glu Ala Met Gly Glu
115 120 125

Tyr Ser Thr Val Ala Gly Gly Ala Asn Asn Gln Ala Lys Gly Asn Tyr
130 135 140

Ser Thr Val Gly Gly Gly Asn Gly Asn Lys Ala Ile Gly Asn Asn Ser
145 150 155 160

Thr Val Val Gly Gly Ser Asn Asn Gln Ala Lys Gly Glu His Ser Thr
165 170 175

Ile Ala Gly Gly Lys Asn Asn Gln Ala Thr Gly Asn Gly Ser Phe Ala
180 185 190

Ala Gly Val Glu Asn Lys Ala Asp Ala Asn Asn Ala Val Ala Leu Gly
195 200 205

Asn Lys Asn Thr Ile Glu Gly Thr Asn Ser Val Ala Ile Gly Ser Asn
210 215 220

Asn Thr Val Lys Thr Gly Lys Glu Asn Val Phe Ile Leu Gly Ser Asn
225 230 235 240

Thr Asn Thr Glu Asn Ala Gln Ser Gly Ser Val Leu Leu Gly Asn Asn
245 250 255

Thr Ala Gly Lys Ala Ala Thr Thr Val Asn Asn Ala Glu Val Asn Gly
260 265 270

Leu Thr Leu Glu Asn Phe Ala Gly Ala Ser Lys Ala Asn Ala Asn Asn
275 280 285

Ile Gly Thr Val Ser Val Gly Ser Glu Asn Asn Glu Arg Gln Ile Val
290 295 300

Asn Val Gly Ala Gly Gln Ile Ser Ala Thr Ser Thr Asp Ala Val Asn
305 310 315 320

Gly Ser Gln Leu His Ala Leu Ala Lys Ala Val Ala Lys Asn Lys Ser
325 330 335

Asp Ile Lys Gly Leu Asn Lys Gly Val Lys Glu Leu Asp Lys Glu Val
340 345 350

Gly Val Leu Ser Arg Asp Ile Asn Ser Leu His Asp Asp Val Ala Asp
355 360 365

Asn Gln Asp Ser Ile Ala Lys Asn Lys Ala Asp Ile Lys Gly Leu Asn
370 375 380

Lys Glu Val Lys Glu Leu Asp Lys Glu Val Gly Val Leu Ser Arg Asp
385 390 395 400

Ile Gly Ser Leu His Asp Asp Val Ala Asp Asn Gln Asp Ser Ile Ala
405 410 415

Lys Asn Lys Ala Asp Ile Lys Gly Leu Asn Lys Glu Val Lys Glu Leu
420 425 430

Asp Lys Glu Val Gly Val Leu Ser Arg Asp Ile Gly Ser Leu His Asp
435 440 445

Asp Val Ala Thr Asn Gln Ala Asp Ile Ala Lys Asn Gln Ala Asp Ile
450 455 460

Lys Thr Leu Glu Asn Asn Val Glu Glu Glu Leu Leu Asn Leu Ser Gly
465 470 475 480

Arg Leu Ile Asp Gln Lys Ala Asp Ile Asp Asn Asn Ile Asn Asn Ile
485 490 495

Tyr Glu Leu Ala Gln Gln Gln Asp Gln His Ser Ser Asp Ile Lys Thr
500 505 510

Leu Lys Asn Asn Val Glu Glu Gly Leu Leu Asp Leu Ser Gly Arg Leu
515 520 525

Ile Asp Gln Lys Ala Asp Leu Thr Lys Asp Ile Lys Thr Leu Lys Asn
530 535 540

Asn Val Glu Glu Gly Leu Leu Asp Leu Ser Gly Arg Leu Ile Asp Gln
545 550 555 560

Lys Ala Asp Ile Ala Lys Asn Gln Ala Asp Ile Ala Gln Asn Gln Thr
565 570 575

Asp Ile Gln Asp Leu Ala Ala Tyr Asn Glu Leu Gln Asp Gln Tyr Ala
580 585 590

Gln Lys Gln Thr Glu Ala Ile Asp Ala Leu Asn Lys Ala Ser Ser Ala
595 600 605

Asn Thr Asp Arg Ile Ala Thr Ala Glu Leu Gly Ile Ala Glu Asn Lys
610 615 620

Lys Asp Ala Gln Ile Ala Lys Ala Gln Ala Asn Glu Asn Lys Asp Gly
625 630 635 640

Ile Ala Lys Asn Gln Ala Asp Ile Gln Leu His Asp Lys Lys Ile Thr
645 650 655

Asn Leu Gly Ile Leu His Ser Met Val Ala Arg Ala Val Gly Asn Asn
660 665 670

Thr Gln Gly Val Ala Thr Asn Lys Ala Asp Ile Ala Lys Asn Gln Ala
675 680 685

Asp Ile Ala Asn Asn Ile Lys Asn Ile Tyr Glu Leu Ala Gln Gln Gln
690 695 700

Asp Gln His Ser Ser Asp Ile Lys Thr Leu Ala Lys Val Ser Ala Ala
705 710 715 720

Asn Thr Asp Arg Ile Ala Lys Asn Lys Ala Glu Ala Asp Ala Ser Phe
725 730 735

Glu Thr Leu Thr Lys Asn Gln Asn Thr Leu Ile Glu Gln Gly Glu Ala
740 745 750

Leu Val Glu Gln Asn Lys Ala Ile Asn Gln Glu Leu Glu Gly Phe Ala
755 760 765

Ala His Ala Asp Val Gln Asp Lys Gln Ile Leu Gln Asn Gln Ala Asp
770 775 780

Ile Thr Thr Asn Lys Thr Ala Ile Glu Gln Asn Ile Asn Arg Thr Val
785 790 795 800

Ala Asn Gly Phe Glu Ile Glu Lys Asn Lys Ala Gly Ile Ala Thr Asn
805 810 815

Lys Gln Glu Leu Ile Leu Gln Asn Asp Arg Leu Asn Gln Ile Asn Glu
820 825 830

Thr Asn Asn His Gln Asp Gln Lys Ile Asp Gln Leu Gly Tyr Ala Leu
835 840 845

Lys Glu Gln Gly Gln His Phe Asn Asn Arg Ile Ser Ala Val Glu Arg
850 855 860

Gln Thr Ala Gly Gly Ile Ala Asn Ala Ile Ala Ile Ala Thr Leu Pro
865 870 875 880

Ser Pro Ser Arg Ala Gly Glu His His Val Leu Phe Gly Ser Gly Tyr
885 890 895

His Asn Gly Gln Ala Ala Val Ser Leu Gly Ala Ala Gly Leu Ser Asp
900 905 910

Thr Gly Lys Ser Thr Tyr Lys Ile Gly Leu Ser Trp Ser Asp Ala Gly
915 920 925

Gly Leu Ser Gly Gly Val Gly Gly Ser Tyr Arg Trp Lys
930 935 940



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3538 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TTCTGTGAGC AAATGACTGG CGTAAATGAC TGATGAGTGT CTATTTAATG AAAGATATCA 60

ATATATAAAA GTTGACTATA GCGATGCAAT ACAGTAAAAT TTGTTACGGC TAAACATAAC 120

GACGGTCCAA GATGGCGGAT ATCGCCATTT ACCAACCTGA TAATCAGTTT GATAGCCATT 180

AGCGATGGCA TCAAGTTGTG TTGTTGTATT GTCATATAAA CGGTAAATTT GGTTTGGTGG 240

ATGCCCCATC TGATTTACCG TCCCCCTAAT AAGTGAGGGG GGGGGGGAGA CCCCAGTCAT 300

TTATTAGGAG ACTAAGATGA ACAAAATTTA TAAAGTGAAA AAAAATGCCG CAGGTCACTT 360

GGTGGCGTGT TCTGAATTTG CCAAAGGTCA TACCAAAAAG GCAGTTTTGG GCAGTTTATT 420

GATTGTTGGA ATATTGGGTA TGGCAACGAC AGCATCTGCA CAAATGGCAA CGACGCCGTC 480

TGCACAAGTA GTCAAGACAA ACAATAAAAA AAACGGCACG CACCCTTTCA TCGGTGGTGG 540

CGATTATAAT ACCACCAAAG GCAATTACCC TACCATCGGT GGTGGCCATT TTAATACCGC 600

CGAAGGCAAT TACTCTACCG TCGGTGGTGG CTTTACTAAC GAAGCCATAG GCAAGAACTC 660

TACCGTCGGT GGTGGCTTTA CTAACGAAGC CATGGGCGAA TACTCAACCG TCGCAGGCGG 720


TGCTAACAAC 


CAAGCCAAAG 


GCAATTACTC 


TACCGTCGGT 


GGTGGCAATG 


GCAACAAAGC 


780 


CATAGGCAAC 


AACTCAACGG 


TTGTAGGTGG 


TTCTAACAAC 


CAAGCCAAAG 


GCGAGCACTC 


840 


TACCATCGCA 


GGGGGCAAGA 


ATAACCAAGC 


TACAGGTAAT 


GGTTCATTTG 


CAGCAGGTGT 


900 


AGAGAACAAA 


GCCGATGCTA 


ACAACGCCGT 


CGCTCTAGGT 


AACAAGAACA 


CCATCGAAGG 


960 


TACAAACTCA 


GTAGCCATCG 


GCTCTAATAA 


TACCGTTAAA 


ACTGGCAAAG 


AAAATGTCTT 


1020 


TATTCTTGGC 


TCTAACACAA 


ACACAGAAAA 


TGCACAAAGT 


GGCTCCGTGC 


TGCTGGGTAA 


1080 


TAATACCGCT 


GGCAAAGCAG 


CGACCACTGT 


TAACAATGCC 


GAAGTGAACG 


GCTTAACCCT 


1140 


AGAAAATTTT 


GCAGGTGCAT 


CAAAAGCTAA 


TGCTAATAAT 


ATTGGTACTG 


TATCTGTCGG 


1200 


TAGTGAGAAT 


AATGAGCGTC 


AAATCGTTAA 


TGTTGGTGCA 


GGTCAGATCA 


GTGCCACCTC 


1260 


AACAGATGCT 


GTTAATGGCT 


CACAGCTACA 


TGCTTTAGCC 


AAAGCTGTTG 


CTAAAAACAA 


1320 


ATCTGACATC 


AAAGGTCTTA 


ATAAGGGGGT 


GAAAGAGCTT 


GATAAGGAGG 


TGGGTGTATT 


1380 


AAGCCGAGAC 


ATTAATTCAC 


TTCATGATGA 


TGTTGCTGAC 


AACCAAGATA 


GCATTGCTAA 


1440 


AAACAAAGCT 


GACATCAAAG 


GTCTTAATAA 


GGAGGTGAAA 


GAGCTTGATA 


AGGAGGTGGG 


1500 


TGTATTAAGC 


CGAGACATTG 


GTTCACTTCA 


TGATGATGTT 


GCTGACAACC 


AAGATAGCAT 


1560 


TGCTAAAAAC 


AAAGCTGACA 


TCAAAGGTCT 


TAATAAGGAG 


GTGAAAGAGC 


TTGATAAGGA 


1620 


GGTGGGTGTA 


TTAAGCCGAG 


ACATTGGTTC 


ACTTCATGAT 


GATGTTGCCA 


CCAACCAAGC 


1680 


TGACATTGCT 


AAAAACCAAG 


CGGATATCAA 


AACACTTGAA 


AACAATGTCG 


AAGAAGAATT 


1740 


ATTAAATCTA 


AGCGGTCGCC 


TCATTGATCA 


AAAAGCGGAT 


ATTGATAATA 


ACATCAACAA 


1800 


TATCTATGAG 


CTGGCACAAC 


AGCAAGATCA 


GCATAGCTCT 


GATATCAAAA 


CACTTAAAAA 


1860 


CAATGTCGAA 


GAAGGTTTGT 


TGGATCTAAG 


CGGTCGCCTC 


ATTGATCAAA 


AAGCAGATCT 


1920 


TACGAAAGAC 


ATCAAAACAC 


TTAAAAACAA 


TGTCGAAGAA 


GGTTTATTGG 


ATCTAAG CGG 


1980 


TCGCCTCATT 


GATCAAAAAG 


CAGATATTGC 


TAAAAACCAA 


GCTGACATTG 


CTCAAAACCA 


2040 


AACAGACATC 


CAAGATCTGG 


CCGCTTACAA 


CGAGCTACAA 


G AC C AGTATG 


CTCAAAAGCA 


2100 


AACCGAAGCG 


ATTGACGCTC 


TAAATAAAGC 


AAGCTCTGCC 


AATACTGATC 


GTATTGCTAC 


2160 


TGCTGAATTG 


GGTATCGCTG 


AGAACAAAAA 


AGACGCTCAG 


ATCGCCAAAG 


CACAAGCCAA 


2220 


TGAAAATAAA 


GACGGCATTG 


CTAAAAACCA 


AGCTGATATC 


CAGTTGCACG 


ATAAAAAAAT 


2280 


CACCAATCTA 


GGTATCCTTC 


AC AG C ATGG T 


TGCAAGAGCG 


GTAGGAAATA , 


ATACACAAGG 


2340 
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TGTTGCTACC AACAAAGCTG ATATTGCTAA AAACCAAGCA GATATTGCTA ATAACATCAA 24 00 

AAATATCTAT GAGCTGGCAC AACAGCAAGA TCAGCATAGC TCTGATATCA AAACCTTGGC 24 6 0 

AAAAGTAAGT GCTGCCAATA CTGATCGTAT TGCTAAAAAC AAAGCTGAAG CTGATGCAAG 2 52 0 

TTTTGAAACG CTCACCAAAA ATCAAAATAC TTTGATTGAG CAAGGTGAAG CATTGGTTGA 2 5 80 

GCAAAATAAA GCCATCAATC AAGAGCTTGA AGGGTTTGCG GCTCATGCAG ATGTTCAAGA 2 64 0 

TAAGCAAATT TTACAAAACC AAGCTGATAT CACTACCAAT AAGACCGCTA TTGAACAAAA 2 7 00 

TATCAATAGA ACTGTTGC C A ATGGGTTTGA GATTGAGAAA AATAAAGCTG GTATTGCTAC 2 76 0 

CAATAAGCAA GAGCTTATTC TTCAAAATGA TCGATTAAAT CAAATTAATG AGACAAATAA 2 82 0 

TCATCAGGAT CAGAAGATTG ATCAATTAGG TTATGCACTA AAAGAGCAGG GTCAGCATTT 2 880 

TAATAATCGT ATTAGTGCTG TTGAGCGTCA AACAGCTGGA GGTATTGCAA ATGCTATCGC 2 94 0 

AATTGCAACT TTACCATCGC CCAGTAGAGC AGGTGAGCAT CATGTCTTAT TTGGTTCAGG 3 0 00 

TTATCACAAT GGTCAAGCTG CGGTATCATT GGGCGCGGCT GGATTAAGTG ATACAGGAAA 3 06 0 

AT C AAC TT AT AAGATTGGTC TAAGCTGGTC AGATGCAGGT GGATTATCTG GTGGTGTTGG 312 0 

TGGCAGTTAC CGCTGGAAAT AGAGCCTAAA TTTAACTGCT GTATCAAAAA ATATGGTCTG 318 0 

TATAAACAGA CCATATTTTT ATCTAAAAAA CTTATCTTAA CTTTTATGAA GCATCATAAG 3 24 0 

CCAAAGCTGA GTAATAATAA GAGATGTTAA AATAAGAGAT GTTAAAACTG CTAAACAATC 3 3 00 

GGCTTGCGAC GATAAAATAA AATACCTGGA ATGGACAGCC CCAAAACCAA TGCTGAGATG 3 36 0 

ATAAAAATCG CCTCAAAAAA ATGACGCATC ATAACGATAA ATAAATCCAT ATCAAATCCA 3420 

AAATAGCCAA TTTGTACCAT GCTAACCATG GCTTTATAGG CAGCGATTCC CGGCATCATA 348 0 

CAAATCAAGC TAGGTACAAT CAAGGCTTTA GGCGGCAGGC CATGACGCTG AGCAAAAA 3 53 8 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 610 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Lys Leu Leu Pro Leu Lys Ile Ala Val Thr Ser Ala Met Ile Ile 
1 5 10 15 

Gly Leu Gly Ala Ala Ser Thr Ala Asn Ala Gln Ser Arg Asp Arg Ser 
20 25 30 

Leu Glu Asp Ile Gln Asp Ser Ile Ser Lys Leu Val Gln Asp Asp Ile 
35 40 45 

Asp Thr Leu Lys Gln Asp Gln Gln Lys Met Asn Lys Tyr Leu Leu Leu 
50 55 60 

Asn Gln Leu Ala Asn Thr Leu Ile Thr Asp Glu Leu Asn Asn Asn Val 
65 70 75 80 

Ile Lys Asn Thr Asn Ser Ile Glu Ala Leu Gly Asp Glu Ile Gly Trp 
85 90 95 

Leu Glu Asn Asp Ile Ala Asp Leu Glu Glu Gly Val Glu Glu Leu Thr 
100 105 110 

Lys Asn Gln Asn Thr Leu Ile Glu Lys Asp Glu Glu His Asp Arg Leu 
115 120 125 

Ile Ala Gln Asn Gln Ala Asp Ile Gln Thr Leu Glu Asn Asn Val Val 
130 135 140 

Glu Glu Leu Phe Asn Leu Ser Gly Arg Leu Ile Asp Gln Glu Ala Asp 
145 150 155 160 

Ile Ala Lys Asn Asn Ala Ser Ile Glu Glu Leu Tyr Asp Phe Asp Asn 
165 170 175 

Glu Val Ala Glu Arg Ile Gly Glu Ile His Ala Tyr Thr Glu Glu Val 
180 185 190 

Asn Lys Thr Leu Glu Asn Leu Ile Thr Asn Ser Val Lys Asn Thr Asp 
195 200 205 

Asn Ile Asp Lys Asn Lys Ala Asp Ile Asp Asn Asn Ile Asn His Ile 
210 215 220 

Tyr Glu Leu Ala Gln Gln Gln Asp Gln His Ser Ser Asp Ile Lys Thr 
225 230 235 240 

Leu Lys Asn Asn Val Glu Glu Gly Leu Leu Glu Leu Ser Gly His Leu 
245 250 255 

Ile Asp Gln Lys Ala Asp Leu Thr Lys Asp Ile Lys Ala Leu Glu Ser 
260 265 270 

Asn Val Glu Glu Gly Leu Leu Asp Leu Ser Gly Arg Leu Leu Asp Gln 
275 280 285 

Lys Ala Asp Leu Thr Lys Asp Ile Lys Ala Leu Glu Ser Asn Val Glu 
290 295 300 

Glu Gly Leu Leu Asp Leu Ser Gly Arg Leu Leu Asp Gln Lys Ala Asp 
305 310 315 320 

Ile Ala Gln Asn Gln Thr Asp Ile Gln Asp Leu Ala Ala Tyr Asn Glu 
325 330 335 

Leu Gln Asp Gln Tyr Ala Gln Lys Gln Thr Glu Ala Ile Asp Ala Leu 
340 345 350 

Asn Lys Ala Ser Ser Glu Asn Thr Gln Asn Ile Glu Asp Leu Ala Ala 
355 360 365 

Tyr Asn Glu Leu Gln Asp Ala Tyr Ala Lys Gln Gln Thr Glu Ala Ile 
370 375 380 

Asp Ala Leu Asn Lys Ala Ser Ser Glu Asn Thr Gln Asn Ile Ala Lys 
385 390 395 400 

Asn Gln Ala Asp Ile Ala Asn Asn Ile Asn Asn Ile Tyr Glu Leu Ala 
405 410 415 

Gln Gln Gln Asp Gln His Ser Ser Asp Ile Lys Thr Leu Ala Lys Ala 
420 425 430 

Ser Ala Ala Asn Thr Asn Arg Ile Ala Thr Ala Glu Leu Gly Ile Ala 
435 440 445 

Glu Asn Lys Lys Asp Ala Gln Ile Ala Lys Ala Gln Ala Asn Ala Asn 
450 455 460 

Lys Thr Ala Ile Asp Glu Asn Lys Ala Ser Ala Asp Thr Lys Phe Ala 
465 470 475 480 

Ala Thr Ala Asp Ala Ile Thr Lys Asn Gly Asn Ala Ile Thr Lys Asn 
485 490 495 

Ala Lys Ser Ile Thr Asp Leu Gly Thr Lys Val Asp Gly Phe Asp Gly 
500 505 510 

Arg Val Thr Ala Leu Asp Thr Lys Val Asn Ala Phe Asp Gly Arg Ile 
515 520 525 

Thr Ala Leu Asp Ser Lys Val Glu Asn Gly Met Ala Ala Gln Ala Ala 
530 535 540 

Leu Ser Gly Leu Phe Gln Pro Tyr Ser Val Gly Lys Phe Asn Ala Thr 
545 550 555 560 

Ala Ala Leu Gly Gly Tyr Gly Ser Lys Ser Ala Val Ala Ile Gly Ala 
565 570 575 

Gly Tyr Arg Val Asn Pro Asn Leu Ala Phe Lys Ala Gly Ala Ala Ile 
580 585 590 

Asn Thr Ser Gly Asn Lys Lys Gly Ser Tyr Asn Ile Gly Val Asn Tyr 
595 600 605 

Glu Phe 
610 



(2) INFORMATION FOR SEQ ID NO: 12: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2673 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CCATCAGTAC ATACGCCGCA CCTGACCGAG ACGCTCCGCC AAATCAATGC GTCGGTGTAC 60 

TACGCCCCGA CCGAGCTATG CACGGATAAT GGTGCGATGA TCGCTTACGC TGGCTTTTGT 120 

CGGCTAAGCC GTGGACAGTC GGATGACTTG GCGGTTCGCT GCATTCCCCG ATGGGATATG 180 

ACAACGCTTG GCGTATCTGC TCATAGATAG CCACATCAAT CATACCAACG ATATTGGTAT 240 

ATACCAAATT GATACCTGCC AAAAATACCA TATTGAAAGT AGGGTTTGGG TATTATTTAT 300 

GTAACTTATA TCTAATTTGG TGTTGATACT TTGATAAAGC CTTGCTATAC TGTAACCTAA 360 

ATGGATATGA TAGAGATTTT TCCATTTATG CCAGCAAAAG AGATAGATAG ATAGATAGAT 420 

AGATAGATAG ATAGATAGAT AGATAGATAG ATAGATAAAA CTCTGTCTTT TATCTGTCCA 480 

CTGATGCTTT CTGCCTGCCA CCGATGATAT CGTTTATCTG CTTTTTTAGG CATCAGTTAT 540 

TTCACCGTGA TGACTGATGT GATGACTTAA CCACCAAAAG AGAGTGCTAA ATGAAAACCA 600 

TGAAACTTCT CCCTCTAAAA ATCGCTGTAA CCAGTGCCAT GATTATTGGT TTGGGTGCGG 660 

CATCTACTGC GAATGCACAG TCTCGGGATA GATCTTTAGA AGATATACAA GATTCAATTA 720 

GTAAACTTGT TCAAGATGAT ATAGATACAC TAAAACAAGA TCAGCAGAAG ATGAACAAGT 780 

ATCTGTTGCT CAACCAGTTA GCTAATACTT TAATTACAGA CGAGCTCAAC AATAATGTTA 840 

TAAAAAACAC CAATTCTATT GAAGCTCTTG GTGATGAGAT TGGATGGCTT GAAAATGATA 900 

TTGCAGACTT GGAAGAAGGT GTTGAAGAAC TCACCAAAAA CCAAAATACT TTGATTGAAA 960 

AAGATGAAGA GCATGACAGA TTAATCGCTC AAAATCAAGC TGATATCCAA ACACTTGAAA 1020 

ACAATGTCGT AGAAGAACTA TTCAATCTAA GCGGTCGCCT AATTGATCAA GAAGCGGATA 1080 

TTGCTAAAAA TAATGCTTCT ATTGAAGAGC TTTATGATTT TGATAATGAG GTTGCAGAAA 1140 

GGATAGGTGA GATACATGCT TATACTGAAG AGGTAAATAA AACTCTTGAA AACTTGATAA 1200 

CAAACAGTGT TAAGAATACT GATAATATTG ACAAAAACAA AGCTGATATT GATAATAACA 1260 

TCAACCATAT CTATGAGCTG GCACAACAGC AAGATCACCA TAGCTCTGAT ATCAAAACAC 1320 

TTAAAAACAA TGTCGAAGAA GGTTTGTTGG AGCTAAGCGG TCACCTCATT GATCAAAAAG 1380 

CGGATCTTAC AAAAGACATC AAAGCACTTG AAAGCAATGT CGAAGAAGGT TTGTTGGATC 1440 

TAAGCGGTCG TCTGCTTGAT CAAAAAGCGG ATCTTACAAA AGACATCAAA GCACTTGAAA 1500 

GCAATGTCGA AGAAGGTTTG TTGGATCTAA GCGGTCGTCT GCTTGATCAA AAAGCGGATA 1560 

TTGCTCAAAA CCAAACAGAC ATCCAAGATC TGGCCGCTTA CAACGAGCTA CAAGACCAGT 1620 

ATGCTCAAAA GCAAACCGAA GCGATTGACG CTCTAAATAA AGCAAGCTCT GAGAATACAC 1680 

AAAACATCGA AGATCTGGCC GCTTACAATG AGCTACAAGA TGCCTATGCC AAACAGCAAA 1740 

CCGAAGCGAT TGACGCTCTA AATAAAGCAA GCTCTGAGAA TACACAAAAC ATTGCTAAAA 1800 

ACCAAGCGGA TATTGCTAAT AACATCAACA ATATCTATGA GCTGGCACAA CAGCAAGATC 1860 

AGCATAGCTC TGATATCAAA ACCTTGGCAA AAGCAAGTGC TGCCAATACT AATCGTATTG 1920 

CTACTGCTGA ATTGGGCATC GCTGAGAACA AAAAAGACGC TCAGATCGCC AAAGCACAAG 1980 

CGAATGCCAA CAAAACTGCG ATTGATGAAA ACAAAGCATC TGCGGATACC AAGTTTGCAG 2040 

CAACAGCAGA CGCCATTACC AAAAATGGAA ATGCTATCAC TAAAAACGCA AAATCTATCA 2100 

CTGATTTGGG CACTAAAGTG GATGGTTTTG ACGGTCGTGT AACTGCATTA GACACCAAAG 2160 

TCAATGCCTT TGATGGTCGT ATCACAGCTT TAGACAGTAA AGTTGAAAAC GGTATGGCTG 2220 

CCCAAGCTGC CCTAAGTGGT CTATTCCAGC CTTATAGCGT TGGTAAGTTT AATGCGACCG 2280 

CTGCACTTGG TGGCTATGGC TCAAAATCTG CGGTTGCTAT CGGTGCTGGC TATCGTGTGA 2340 

ATCCAAATCT GGCGTTTAAA GCTGGTGCGG CGATTAATAC CAGTGGCAAT AAAAAAGGCT 2400 

[illegible] CGGTGTGAAT TACGAGTTCT AATTGTCTAT CATCACCAAA AAAAGCAGTC 2460 

AGTTTACTGG CTGCTTTTTT ATGGGTTTTT GTGGCTTTTG GTTGTGAGTG ATGGATAAAA 2520 

GCTTATCAAG CGATTGATGA ATATCAATAA ATGATTGGTA AATATCAATA AAGCGGTTTA 2580 

GGGTTTTTGG ATATCTTTTA ATAAGTTTAA AAACCCCTGC ATAAAATAAA GCTGGGCATC 2640 

AGAGCTGCGA GTAGCGGCAT ACAGCGGGAG ATC 2673 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 873 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Asn Lys Ile Tyr Lys Val Lys Lys Asn Ala Ala Gly His Leu Val 
1 5 10 15 

Ala Cys Ser Glu Phe Ala Lys Gly His Thr Lys Lys Ala Val Leu Gly 
20 25 30 

Ser Leu Leu Ile Val Gly Ile Leu Gly Met Ala Thr Thr Ala Ser Ala 
35 40 45 

Gln Gln Thr Ile Ala Arg Gln Gly Lys Gly Met His Ser Ile Ile Gly 
50 55 60 

Gly Gly Asn Asp Asn Glu Ala Asn Gly Asp Tyr Ser Thr Val Ser Gly 
65 70 75 80 

Gly Asp Tyr Asn Glu Ala Lys Gly Asp Ser Ser Thr Ile Gly Gly Gly 
85 90 95 

Tyr Tyr Asn Glu Ala Asn Gly Asp Ser Ser Thr Ile Gly Gly Gly Phe 
100 105 110 

Tyr Asn Glu Ala Lys Gly Glu Ser Ser Thr Ile Gly Gly Gly Asp Asn 
115 120 125 

Asn Ser Ala Thr Gly Met Tyr Ser Thr Ile Gly Gly Gly Asp Asn Asn 
130 135 140 

Ser Ala Thr Gly Arg Tyr Ser Thr Ile Ala Gly Gly Trp Leu Asn Gln 
145 150 155 160 

Ala Thr Gly His Ser Ser Thr Val Ala Gly Gly Trp Leu Asn Gln Ala 
165 170 175 

Thr Asn Glu Asn Ser Thr Val Gly Gly Gly Arg Phe Asn Gln Ala Thr 
180 185 190 

Gly Arg Asn Ser Thr Val Ala Gly Gly Tyr Lys Asn Lys Ala Thr Gly 
195 200 205 

Val Asp Ser Thr Ile Ala Gly Gly Arg Asn Asn Gln Ala Asn Gly Ile 
210 215 220 

Gly Ser Phe Ala Ala Gly Ile Asp Asn Gln Ala Asn Ala Asn Asn Thr 
225 230 235 240 

Val Ala Leu Gly Asn Lys Asn Ile Ile Lys Gly Lys Asp Ser Val Ala 
245 250 255 

Ile Gly Ser Asn Asn Thr Val Glu Thr Gly Lys Glu Asn Val Phe Ile 
260 265 270 

Leu Gly Ser Asn Thr Lys Asp Ala His Ser Asn Ser Val Leu Leu Gly 
275 280 285 

Asn Glu Thr Thr Gly Lys Ala Ala Thr Thr Val Glu Asn Ala Lys Val 
290 295 300 

Gly Gly Leu Ser Leu Thr Gly Phe Val Gly Ala Ser Lys Ala Asn Thr 
305 310 315 320 

Asn Asn Gly Thr Val Ser Val Gly Lys Gln Gly Lys Glu Arg Gln Ile 
325 330 335 

Val Asn Val Gly Ala Gly Gln Ile Arg Ala Asp Ser Thr Asp Ala Val 
340 345 350 

Asn Gly Ser Gln Leu His Ala Leu Ala Thr Ala Val Asp Ala Glu Phe 
355 360 365 

Arg Thr Leu Thr Gln Thr Gln Asn Ala Leu Ile Glu Gln Gly Glu Ala 
370 375 380 

Ile Asn Gln Glu Leu Glu Gly Leu Ala Asp Tyr Thr Asn Ala Gln Asp 
385 390 395 400 

Glu Lys Ile Leu Lys Asn Gln Thr Asp Ile Thr Ala Asn Lys Thr Ala 
405 410 415 

Ile Glu Gln Asn Phe Asn Arg Thr Val Thr Asn Gly Phe Glu Ile Glu 
420 425 430 

Lys Asn Lys Ala Gly Ile Ala Lys Asn Gln Ala Asp Ile Gln Thr Leu 
435 440 445 

Glu Asn Asp Val Gly Lys Glu Leu Leu Asn Leu Ser Gly Arg Leu Leu 
450 455 460 

Asp Gln Lys Ala Asp Ile Asp Asn Asn Ile Asn Asn Ile Tyr Glu Leu 
465 470 475 480 

Ala Gln Gln Gln Asp Gln His Ser Ser Asp Ile Lys Thr Leu Lys Asn 
485 490 495 

Asn Val Glu Glu Gly Leu Leu Asp Leu Ser Gly Arg Leu Ile Asp Gln 
500 505 510 

Lys Ala Asp Leu Thr Lys Asp Ile Lys Ala Leu Glu Asn Asn Val Glu 
515 520 525 

Glu Gly Leu Leu Asp Leu Ser Gly Arg Leu Ile Asp Gln Lys Ala Asp 
530 535 540 

Ile Ala Lys Asn Gln Ala Asp Ile Gln Asp Leu Ala Ala Tyr Asn Glu 
545 550 555 560 

Leu Gln Asp Gln Tyr Ala Gln Lys Gln Thr Glu Ala Ile Asp Ala Leu 
565 570 575 

Asn Lys Ala Ser Ser Ala Asn Thr Asp Arg Ile Ala Thr Ala Glu Leu 
580 585 590 

Gly Ile Ala Glu Asn Lys Lys Asp Ala Gln Ile Ala Lys Ala Gln Ala 
595 600 605 

Asn Glu Asn Lys Asp Gly Ile Ala Lys Asn Gln Ala Asp Ile Ala Asn 
610 615 620 

Asn Ile Lys Asn Ile Tyr Glu Leu Ala Gln Gln Gln Asp Gln His Ser 
625 630 635 640 

Ser Asp Ile Lys Thr Leu Ala Lys Val Ser Ala Ala Asn Thr Asp Arg 
645 650 655 

Ile Ala Lys Asn Lys Ala Glu Ala Asp Ala Ser Phe Glu Thr Leu Thr 
660 665 670 

Lys Asn Gln Asn Thr Leu Ile Glu Gln Gly Glu Ala Leu Val Glu Gln 
675 680 685 

Asn Lys Ala Ile Asn Gln Glu Leu Glu Gly Phe Ala Ala His Ala Asp 
690 695 700 

Val Gln Asp Lys Gln Ile Leu Gln Asn Gln Ala Asp Ile Thr Ala Asn 
705 710 715 720 

Lys Thr Ala Ile Glu Gln Asn Ile Asn Arg Thr Val Ala Asn Gly Phe 
725 730 735 

Glu Ile Glu Lys Asn Lys Ala Gly Ile Ala Thr Asn Lys Gln Glu Leu 
740 745 750 

Ile Leu Gln His Asp Arg Leu Asn Arg Ile Asn Glu Thr Asn Asn Arg 
755 760 765 

Gln Asp Gln Lys Ile Asp Gln Leu Gly Tyr Ala Leu Lys Glu Gln Gly 
770 775 780 

Gln His Phe Asn Asn Arg Ile Ser Ala Val Glu Arg Gln Thr Ala Gly 
785 790 795 800 

Gly Ile Ala Asn Ala Ile Ala Ile Ala Thr Leu Pro Ser Pro Ser Arg 
805 810 815 

Ala Gly Glu His His Val Leu Phe Gly Ser Gly Tyr His Asn Gly Gln 
820 825 830 

Ala Ala Val Ser Leu Gly Ala Ala Gly Leu Ser Asp Thr Gly Lys Ser 
835 840 845 

Thr Tyr Lys Ile Gly Leu Ser Trp Ser Asp Ala Gly Gly Leu Ser Gly 
850 855 860 

Gly Val Gly Gly Ser Tyr Arg Trp Lys 
865 870 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3292 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

GTAAATGACT GATGAGTGTC TATTTAATGA AAGATACAAT ATATAAAAGT TGACTATAGC 60 

GATGCAATAC AGTAAAATTT GTTACGGCTA AACATAACGA CGGTCCAAGA TGGCGGATAT 120 

CGCCATTTAC CAACCTGATA ATCAGTTTGA TAGCCATTAG CGATGGCATC AAGTTGTGTT 180 

GTTGTATTGT CATATAAACG GTAAATTTGG TTTGGTGGAT GCCCCATCTG ATTTACCGTC 240 

CCCCTAATAA GTGAGAGGGG GGGGGAGACC CCAGTCATTT ATTAGGAGAC TAAGATGAAC 300 

AAAATTTATA AAGTGAAAAA AAATGCCGCA GGTCACTTGG TGGCATGTTC TGAATTTGCC 360 

AAAGGCCATA CCAAGAAGGC AGTTTTGGGC AGTTTATTGA TTGTTGGAAT ATTGGGTATG 420 

GCAACGACAG CATCTGCACA ACAAACAATC GCACGCCAAG GCAAAGGCAT GCACTCTATC 480 

ATCGGTGGTG GCAATGACAA CGAAGCCAAC GGCGATTACT CTACCGTCAG TGGTGGCGAT 540 

TATAACGAAG CCAAAGGCGA TAGCTCTACC ATCGGTGGTG GCTATTATAA CGAAGCCAAC 600 

GGCGATAGCT CTACCATCGG TGGTGGCTTT TATAACGAAG CCAAAGGCGA GAGCTCTACC 660 

ATCGGTGGTG GCGATAACAA CTCAGCCACA GGCATGTACT CTACCATCGG TGGTGGCGAT 720 

AACAACTCAG CCACAGGCAG GTACTCTACC ATCGCAGGGG GTTGGCTTAA CCAAGCTACA 780 

GGTCATAGCT CAACGGTTGC AGGGGGTTGG CTTAACCAAG CTACAAACGA GAATTCTACC 840 

GTTGGTGGCG GCAGGTTTAA CCAAGCTACA GGTCGTAACT CAACGGTTGC AGGGGGCTAT 900 

AAAAACAAAG CCACAGGCGT AGACTCTACC ATCGCAGGGG GCAGGAATAA CCAAGCCAAC 960 

GGTATAGGTT CATTTGCAGC AGGTATAGAC AACCAAGCCA ATGCCAACAA CACCGTCGCT 1020 

CTAGGTAACA AGAACATCAT CAAAGGTAAA GACTCAGTAG CCATCGGCTC TAATAATACC 1080 

GTTGAAACTG GCAAAGAAAA TGTCTTTATT CTTGGCTCTA ACACAAAAGA TGCACATAGT 1140 

AACTCAGTGC TACTGGGTAA TGAGACCACT GGCAAAGCAG CGACCACTGT TGAGAATGCC 1200 

AAAGTGGGTG GTCTAAGCCT AACAGGATTT GTAGGTGCAT CAAAAGCTAA TACTAATAAT 1260 

GGTACTGTAT CTGTCGGTAA GCAGGGTAAA GAGCGTCAAA TCGTTAATGT TGGTGCAGGT 1320 

CAGATCCGTG CTGATTCAAC AGATGCTGTT AATGGCTCAC AGCTACATGC TTTGGCCACA 1380 

GCTGTCGATG CAGAATTTAG AACACTCACC CAAACTCAAA ATGCTTTGAT TGAGCAAGGT 1440 

GAAGCCATCA ATCAAGAGCT TGAAGGTTTG GCAGATTATA CAAATGCTCA AGATGAGAAA 1500 

ATTCTAAAAA ACCAAACTGA CATCACTGCC AATAAAACTG CTATTGAGCA AAATTTTAAT 1560 

AGAACTGTTA CCAATGGGTT TGAGATTGAG AAAAATAAAG CTGGTATTGC TAAAAACCAA 1620 

GCGGATATCC AAACACTTGA AAACGATGTC GGAAAAGAAC TATTAAATCT AAGCGGTCGC 1680 

CTGCTTGATC AAAAAGCAGA TATTGATAAT AACATCAACA ATATCTATGA GCTGGCACAA 1740 

CAGCAAGATC AGCATAGCTC TGATATCAAA ACACTTAAAA ACAATGTCGA AGAAGGTTTG 1800 

TTGGATCTAA GCGGTCGCCT CATTGATCAA AAAGCAGATC TTACGAAAGA CATCAAAGCA 1860 

CTTGAAAACA ATGTCGAAGA AGGTTTATTG GATCTAAGCG GTCGCCTCAT TGATCAAAAA 1920 

GCAGATATTG CTAAAAACCA AGCAGACATC CAAGATTTGG CCGCTTACAA CGAGCTACAA 1980 

GACCAGTATG CTCAAAAGCA AACCGAAGCG ATTGACGCTC TAAATAAAGC AAGCTCTGCC 2040 

AATACTGATC GTATTGCTAC TGCTGAATTG GGTATCGCTG AGAACAAAAA AGACGCTCAG 2100 

ATCGCCAAAG CACAAGCCAA TGAAAATAAA GACGGCATTG CTAAAAACCA AGCAGATATT 2160 

GCTAATAACA TCAAAAATAT CTATGAGCTG GCACAACAGC AAGATCAGCA TAGCTCTGAT 2220 

ATCAAAACCT TGGCAAAAGT AAGTGCTGCC AATACTGATC GTATTGCTAA AAACAAAGCT 2280 

GAAGCTGATG CAAGTTTTGA AACGCTCACC AAAAATCAAA ATACTTTGAT TGAGCAAGGT 2340 

GAAGCATTGG TTGAGCAAAA TAAAGCCATC AATCAAGAGC TTGAAGGGTT TGCGGCTCAT 2400 

GCAGATGTTC AAGATAAGCA AATTTTACAA AACCAAGCTG ATATCACTGC CAATAAGACC 2460 

GCTATTGAAC AAAATATCAA TAGAACTGTT GCCAATGGGT TTGAGATTGA GAAAAATAAA 2520 

GCTGGTATTG CTACCAATAA GCAAGAGCTT ATTCTTCAAC ATGATCGATT AAATCGAATT 2580 

AATGAGACAA ATAATCGTCA GGATCAGAAG ATTGATCAAT TAGGTTATGC ACTAAAAGAG 2640 

CAGGGTCAGC ATTTTAATAA TCGTATTAGT GCTGTTGAGC GTCAAACAGC TGGAGGTATT 2700 

GCAAATGCTA TCGCAATTGC AACTTTACCA TCGCCCAGTA GAGCAGGTGA GCATCATGTC 2760 

TTATTTGGTT CAGGTTATCA CAATGGTCAA GCTGCGGTAT CATTGGGTGC GGCTGGGTTA 2820 

AGTGATACAG GAAAATCAAC TTATAAGATT GGTCTAAGCT GGTCAGATGC AGGTGGATTA 2880 

TCTGGTGGTG TTGGTGGTAG TTACCGCTGG AAATAGAGCC TAAATTTAAC TGCTGTATCA 2940 

AAAAATATGG TCTGTATAAA CAGACCATAT TTTTATCTAA AAACTTATCT TAACTTTTAT 3000 

GAAGCATCAT AAGCCAAAGC TGAGTAATAA TAAGAGATGT TAAAATAAGA GATGTTAAAA 3060 

CTGCTAAACA ATCGGCTTAC GACGATAAAA TAAAATACCT GGAATGGACA GCCCCAAAAC 3120 

CAATGCTGAG ATGATAAAAA TCGCCTCAAA AAAATGACGC ATCATAACGA TAAATAAATC 3180 

CATATCAAAT CCAAAATAGC CAATTTGTAC CATGCTAACC ATGGCTTTAT AGGCAGCGAT 3240 

TCCCGGCATC ATACAAATCA AGCTAGGTAC AATCAAGGCT TTAGGCGGCA GG 3292 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 889 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Val Asn Lys Ile Tyr Lys Val Lys Lys Asn Ala Ala Gly His Leu Val 
1 5 10 15 

Ala Cys Ser Glu Phe Ala Lys Gly His Thr Lys Lys Ala Val Leu Gly 
20 25 30 

Ser Leu Leu Ile Val Gly Ala Leu Gly Met Ala Thr Thr Ala Ser Ala 
35 40 45 

Gln Pro Leu Val Ser Thr Asn Lys Pro Asn Gln Gln Val Lys Gly Tyr 
50 55 60 

Trp Ser Ile Ile Gly Ala Gly Arg His Asn Asn Val Gly Gly Ser Ala 
65 70 75 80 

His His Ser Gly Ile Leu Gly Gly Trp Lys Asn Thr Val Asn Gly Tyr 
85 90 95 

Thr Ser Ala Ile Val Gly Gly Tyr Gly Asn Glu Thr Gln Gly Asp Tyr 
100 105 110 

Thr Phe Val Gly Gly Gly Tyr Lys Asn Leu Ala Lys Gly Asn Tyr Thr 
115 120 125 

Phe Val Gly Gly Gly Tyr Lys Asn Leu Ala Glu Gly Asp Asn Ala Thr 
130 135 140 

Ile Ala Gly Gly Phe Ala Asn Leu Ala Glu Gly Asp Asn Ala Thr Ile 
145 150 155 160 

Ala Gly Gly Phe Glu Asn Arg Ala Glu Gly Ile Asp Ser Val Val Ser 
165 170 175 

Gly Gly Tyr Ala Asn Gln Ala Thr Gly Glu Ser Ser Thr Val Ala Gly 
180 185 190 

Gly Ser Asn Asn Leu Ala Glu Gly Lys Ser Ser Ala Ile Gly Gly Gly 
195 200 205 

Arg Gln Asn Glu Ala Ser Gly Asp Arg Ser Thr Val Ser Gly Gly Tyr 
210 215 220 

Asn Asn Leu Ala Glu Gly Lys Ser Ser Ala Ile Gly Gly Gly Glu Phe 
225 230 235 240 

Asn Leu Ala Leu Gly Asn Asn Ala Thr Ile Ser Gly Gly Arg Gln Asn 
245 250 255 

Glu Ala Ser Gly Asp Arg Ser Thr Val Ala Gly Gly Glu Gln Asn Gln 
260 265 270 

Ala Ile Gly Lys Tyr Ser Thr Ile Ser Gly Gly Arg Gln Asn Glu Ala 
275 280 285 

Ser Gly Asp Arg Ser Thr Val Ala Gly Gly Glu Gln Asn Gln Ala Ile 
290 295 300 

Gly Lys Tyr Ser Thr Val Ser Gly Gly Tyr Arg Asn Gln Ala Thr Gly 
305 310 315 320 

Lys Gly Ser Phe Ala Ala Gly Ile Asp Asn Lys Ala Asn Ala Asp Asn 
325 330 335 

Ala Val Ala Leu Gly Asn Lys Asn Thr Ile Glu Gly Glu Asn Ser Val 
340 345 350 

Ala Ile Gly Ser Asn Asn Thr Val Lys Lys Asn Gln Lys Asn Val Phe 
355 360 365 

Ile Leu Gly Ser Asn Thr Asp Thr Lys Asp Ala Gln Ser Gly Ser Val 
370 375 380 

Leu Leu Gly Asp Asn Thr Ser Gly Lys Ala Ala Thr Ala Val Glu Asp 
385 390 395 400 

Ala Thr Val Gly Asp Leu Ser Leu Thr Gly Phe Ala Gly Val Ser Lys 
405 410 415 

Ala Asn Ser Gly Thr Val Ser Val Gly Ser Glu Gly Lys Glu Arg Gln 
420 425 430 

Ile Val His Val Gly Ala Gly Arg Ile Ser Asn Asp Ser Thr Asp Ala 
435 440 445 

Val Asn Gly Ser Gln Leu Tyr Ala Leu Ala Ala Ala Val Asp Asp Asn 
450 455 460 

Gln Tyr Asp Ile Glu Lys Asn Gln Asp Asp Ile Ala Lys Asn Gln Ala 
465 470 475 480 

Asp Ile Ala Lys Asn Gln Ala Asp Ile Gln Thr Leu Glu Asn Asp Val 
485 490 495 

Gly Lys Glu Leu Leu Asn Leu Ser Gly Arg Leu Ile Asp Gln Lys Ala 
500 505 510 

Asp Ile Asp Asn Asn Ile Asn His Ile Tyr Glu Leu Ala Gln Gln Gln 
515 520 525 

Asp Gln His Ser Ser Asp Ile Lys Thr Leu Lys Lys Asn Val Glu Glu 
530 535 540 

Gly Leu Leu Glu Leu Ser Gly His Leu Ile Asp Gln Lys Ala Asp Leu 
545 550 555 560 

Thr Lys Asp Ile Lys Ala Leu Glu Ser Asn Val Glu Glu Gly Leu Leu 
565 570 575 

Asp Leu Ser Gly Arg Leu Ile Asp Gln Lys Ala Asp Ile Ala Gln Asn 
580 585 590 

Gln Ala Asn Ile Gln Asp Leu Ala Ala Tyr Asn Glu Leu Gln Asp Gln 
595 600 605 

Tyr Ala Gln Lys Gln Thr Glu Ala Ile Asp Ala Leu Asn Lys Ala Ser 
610 615 620 

Ser Glu Asn Thr Gln Asn Ile Glu Asp Leu Ala Ala Tyr Asn Glu Leu 
625 630 635 640 

Gln Asp Ala Tyr Ala Lys Gln Gln Thr Glu Ala Ile Asp Ala Leu Asn 
645 650 655 

Lys Ala Ser Ser Glu Asn Thr Gln Asn Ile Ala Lys Asn Gln Ala Asp 
660 665 670 

Ile Ala Asn Asn Ile Asn Asn Ile Tyr Glu Leu Ala Gln Gln Gln Asp 
675 680 685 

Gln His Ser Ser Asp Ile Lys Thr Leu Ala Lys Ala Ser Ala Ala Asn 
690 695 700 

Thr Asp Arg Ile Ala Lys Asn Lys Ala Asp Ala Asp Ala Ser Phe Glu 
705 710 715 720 

Thr Leu Thr Lys Asn Gln Asn Thr Leu Ile Glu Lys Asp Lys Glu His 
725 730 735 

Asp Lys Leu Ile Thr Ala Asn Lys Thr Ala Ile Asp Ala Asn Lys Ala 
740 745 750 

Ser Ala Asp Thr Lys Phe Ala Ala Thr Ala Asp Ala Ile Thr Lys Asn 
755 760 765 

Gly Asn Ala Ile Thr Lys Asn Ala Lys Ser Ile Thr Asp Leu Gly Thr 
770 775 780 

Lys Val Asp Gly Phe Asp Gly Arg Val Thr Ala Leu Asp Thr Lys Val 
785 790 795 800 

Asn Ala Phe Asp Gly Arg Ile Thr Ala Leu Asp Ser Lys Val Glu Asn 
805 810 815 

Gly Met Ala Ala Gln Ala Ala Leu Ser Gly Leu Phe Gln Pro Tyr Ser 
820 825 830 

Val Gly Lys Phe Asn Ala Thr Ala Ala Leu Gly Gly Tyr Gly Ser Lys 
835 840 845 

Ser Ala Val Ala Ile Gly Ala Gly Tyr Arg Val Asn Pro Asn Leu Ala 
850 855 860 

Phe Lys Ala Gly Ala Ala Ile Asn Thr Ser Gly Asn Lys Lys Gly Ser 
865 870 875 880 

Tyr Asn Ile Gly Val Asn Tyr Glu Phe 
885 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4228 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

GCCGCACCCT GACCGAGACG CTCCGCCAAA TCGATGCGTC GGTGTACTAT GCCCCGACCG 60 

AGCTATGCAC GGATAATGGT GCGATGATCG CCTATGCTGG CTTTTGTCGG CTAAGCCGTG 120 

GACAGTCGGA TGACTTGGTG GTTCGCTGTA TTCCCCGATG GGATATGACG ACGCTTGGTA 180 

TCGAATATGA TAATTAGGCT GTGGTATTTG AGTTTTGAGT AATGTACCTA CTACCACTAA 240 

TTTATCATAC AATACATAAA CATAAAAAAC ATCGGTATTG TTAAAAAACA ATACCCAAGT 300 

TAAAATAGCT CAATACTTTA CCATAGCACA AAGAAACTTG TGAACGAAAC ATTTAATAAT 360 

TGCCCAAAAT GTTACTGCAC ACACTTTGTA AAAGCAGGCT TGGGCAATGG CAAACAACGA 420 

TACAAATGCA AAGGTTGCCA TCACTATTTT TCTGTGAAGC AACGAAGCAA CCAAAAAAGT 480 

AATGACATTA AAAAAACAAG CCATTGATAC AAACAGTAAA CAAATCTTAG GCTTTGTCTG 540 

TGGTAAAACA GACACTAACA CCTTTAAACG ACTTTATCAG CAGTTAAATA CCCATAGCAT 600 

TCAACTGTTT TTTAGTGACT ACTGGAAATC TTATCGTCAA GTCATTTTAA AGCCAAAACA 660 

TATAACAAGC AAAGCTCAAA CTTTTACCAT AGAGGGCTAT AATAGTCTCA TTAGGCATTT 720 

CATAGCAAGA TTTACAAGAA AGTCAAAGTG TTATTCTAAA TCCGAAAAAA TGATAGAAAA 780 

CACGTTGAAT TTATTATTTG CTAAGTGGAA TGGTAGCTTA AGATATGTAT TTTAATTTAA 840 

CAATGCCAAA AACATCAATT ACAGTAAGAT TTTAGGCGTT TTGCAGTTGC TACTTTAGTA 900 

AAGCTTTGTT ATACTAGCTG TTAGTATACT CAAGCTTGTT TGTGTTTGAG CTATATTTAT 960 

TTTATAGCAG TAGTTGGTTA TAAAATATAA ATAAAGCTAA GCTCGAGGGT TTGGTAATGG 1020 

TTTTTTATGT TTATAATACC AACAGAGTCT ATACAGCTAA AATAGCTAAT ACCTTAGGTG 1080 

TATTACAAGT AAAAATCCTT TGGTTAATCA GGGGGTGTAT TATATGTATA TTTCCTTTGT 1140 

ATTTGGTTAT AGCAATCCCT TGGTAAGAAA TCATATCTAT TTTTTATTGT TCAATTATTT 1200 

AGGAGACTAA GGTGAACAAA ATTTATAAAG TGAAAAAAAA TGCCGCAGGT CACTTGGTGG 1260 

CATGTTCTGA ATTTGCCAAA GGCCATACCA AAAAGGCAGT TTTGGGCAGT TTATTGATTG 1320 

TTGGGGCGTT GGGCATGGCA ACGACGGCGT CTGCACAGCC ATTAGTAAGT ACAAATAAGC 1380 

CTAATCAGCA GGTAAAGGGT TATTGGTCTA TTATTGGTGC AGGTCGTCAT AATAACGTAG 1440 

GTGGATCCGC TCATCACTCA GGGATTCTTG GTGGTTGGAA AAATACAGTC AATGGCTATA 1500 

CCTCAGCCAT TGTAGGTGGT TATGGTAACG AAACTCAGGG TGATTATACA TTCGTCGGTG 1560 

GTGGTTATAA AAACTTGGCA AAGGGTAATT ATACATTCGT CGGTGGTGGT TATAAAAACT 1620 

TGGCAGAGGG TGATAATGCA ACCATCGCTG GTGGTTTTGC AAACTTGGCA GAGGGTGATA 1680 

ATGCAACCAT CGCTGGTGGT TTTGAAAACC GTGCAGAGGG TATCGACTCA GTAGTTTCTG 1740 

GTGGTTATGC CAACCAAGCT ACAGGAGAAA GCTCAACCGT CGCAGGTGGT TCTAATAACC 1800 

TAGCAGAGGG CAAAAGCTCA GCCATTGGTG GTGGCCGTCA AAATGAGGCG TCTGGTGACC 1860 

GATCTACTGT CTCAGGTGGT TATAATAACC TAGCAGAGGG CAAAAGCTCA GCCATTGGTG 1920 

GCGGTGAGTT TAACTTAGCA TTAGGGAATA ACGCTACCAT TAGTGGTGGC CGTCAAAATG 1980 

AGGCGTCTGG TGACCGATCT ACTGTCGCAG GTGGTGAACA AAACCAAGCC ATAGGCAAGT 2040 

ATTCTACCAT TAGTGGTGGC CGTCAAAATG AGGCGTCTGG TGACCGATCT ACTGTCGCAG 2100 

GTGGTGAACA AAACCAAGCC ATAGGCAAGT ATTCTACCGT TAGTGGTGGC TATCGAAACC 2160 

AAGCCACAGG TAAAGGTTCA TTTGCAGCAG GTATAGATAA CAAAGCCAAT GCCGACAACG 2220 

CCGTCGCTCT AGGTAACAAG AACACCATCG AAGGTGAAAA CTCAGTAGCC ATCGGCTCTA 2280 

ATAATACCGT TAAAAAAAAT CAAAAAAATG TCTTTATTCT TGGCTCTAAC ACAGACACAA 2340 

AAGATGCACA AAGCGGCTCA GTACTGCTAG GTGATAATAC CTCTGGTAAA GCAGCGACCG 2400 

CTGTTGAGGA TGCCACAGTG GGTGATCTAA GCCTAACAGG ATTTGCAGGC GTATCAAAAG 2460 

CTAATAGTGG TACTGTATCT GTCGGTAGTG AGGGTAAAGA GCGTCAAATC GTTCATGTTG 2520 

GTGCAGGTCG GATCAGTAAT GATTCAACAG ATGCTGTTAA TGGCTCACAG CTATATGCTT 2580 

TGGCCGCAGC TGTTGATGAC AACCAATATG ACATTGAAAA AAACCAAGAT GACATTGCTA 2640 

AAAACCAAGC TGACATTGCT AAAAACCAAG CTGACATCCA AACACTTGAA AACGATGTCG 2700 

GAAAAGAACT ATTAAATCTA AGCGGTCGCC TCATTGATCA AAAAGCAGAT ATTGATAATA 2760 

ACATCAACCA TATCTATGAG CTGGCACAAC AGCAAGATCA GCATAGCTCT GATATCAAAA 2820 

CACTTAAAAA AAATGTCGAA GAAGGTTTGT TGGAGCTAAG CGGTCACCTC ATTGATCAAA 2880 

AAGCAGATCT TACAAAAGAC ATCAAAGCAC TTGAAAGCAA TGTCGAAGAA GGTTTGTTGG 2940 

ATCTAAGCGG TCGCCTCATT GATCAAAAAG CAGATATTGC TCAAAACCAA GCTAACATCC 3000 

AAGATTTGGC TGCTTACAAC GAGCTACAAG ACCAGTATGC TCAAAAGCAA ACCGAAGCGA 3060 

TTGACGCTCT AAATAAAGCA AGCTCTGAGA ATACACAAAA CATCGAAGAT CTGGCCGCTT 3120 

ACAACGAGCT ACAAGATGCC TATGCCAAAC AGCAAACCGA AGCCATTGAC GCTCTAAATA 3180 

AAGCAAGCTC TGAGAATACA CAAAACATTG CTAAAAACCA AGCGGATATT GCTAATAACA 3240 

TCAACAATAT CTATGAGCTA GCACAACAGC AAGATCAGCA TAGCTCTGAT ATCAAAACCT 3300 

TGGCAAAAGC AAGTGCTGCC AATACTGATC GTATTGCTAA AAACAAAGCC GATGCTGATG 3360 

CAAGTTTTGA AACGCTCACC AAAAATCAAA ATACTTTGAT TGAAAAAGAT AAAGAGCATG 3420 

ACAAATTAAT TACTGCAAAC AAAACTGCGA TTGATGCCAA TAAAGCATCT GCGGATACCA 3480 

AGTTTGCAGC GACAGCAGAC GCCATTACCA AAAATGGAAA TGCTATCACT AAAAACGCAA 3540 

AATCTATCAC TGATTTGGGT ACTAAAGTGG ATGGTTTTGA CGGTCGTGTA ACTGCATTAG 3600 

ACACCAAAGT CAATGCCTTT GATGGTCGTA TCACAGCTTT AGACAGTAAA GTTGAAAACG 3660 

GTATGGCTGC CCAAGCTGCC CTAAGTGGTC TATTCCAGCC TTATAGCGTT GGTAAGTTTA 3720 

ATGCGACCGC TGCACTTGGT GGCTATGGCT CAAAATCTGC GGTTGCTATC GGTGCTGGCT 3780 

ATCGTGTGAA TCCAAATCTG GCGTTTAAAG CTGGTGCGGC GATTAATACC AGTGGCAATA 3840 

AAAAAGGCTC TTATAACATC GGTGTGAATT ACGAGTTCTA ATTGTCTATC ATCACCAAAA 3900 

AAAGCAGTCA GTTTACTGGC TGCTTTTTTA TGGGTTTTTG TGGCTTTTGG TTGTGAGTGA 3960 

TGGATAAAAG CTTGTCAAGC GATTGATGAA TATCAATAAA TGATTGGTAA ATATCAATAA 4020 

AGCGGTTTAG GGTTTTTGGA TATCTTTTAA TAAGTTTAAA AACCCCTGCA TAAAATAAAG 4080 

CTGGCATCAG AGCTGCGAAG TAGCGGCATA CAGCTGGCAA TGCACGCCTG TGCCTAGGGG 4140 

GCGTGAGACC ACCCAGCCTT TGCGTTCGTA TTCTAAAATT ACCCAATCAG GCAGAGCGGC 4200 

AACTCCATGT TCGGAGGCGA CCAGCTGA 4228 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Ala Gln Gln Gln Asp Gln His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Tyr Glu Leu Ala Gln Gln Gln Asp Gln His 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY : linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Tyr Asp Leu Ala Gln Gln Gln Asp Gln His
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
GACGCTCAAC AGCACTAATA CG 

(2) INFORMATION FOR SEQ ID NO : 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
CCAAGCTGAT ATCACTACC 
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 22: 
TCAATGCCTT TGATGGTC 18 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
TGTATGCCGC TACTCGCAGC T 21 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site

(B) LOCATION: 2..13

(D) OTHER INFORMATION: /note= "X = any"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Asn Xaa Ala Xaa Xaa Tyr Ser Xaa Ile Gly Gly Gly Xaa Asn
1 5 10



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 25: 
Gln Ala Asp Ile
1 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Ala Ala Gln Ala Ala Leu Ser Gly Leu Phe Val Pro Tyr Ser Val Gly
1 5 10 15

Lys Phe Asn Ala Thr Ala Ala Leu Gly Gly Tyr Gly Ser Lys 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

Gly Lys Ile Thr Lys Asn Ala Ala Arg Gln Glu Asn Gly
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY : linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Val Ile Gly Asp Leu Gly Arg Lys Val
1 5



(2) INFORMATION FOR SEQ ID NO : 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 4

(D) OTHER INFORMATION: /note= "X = any"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

Ala Leu Glu Xaa Asn Val Glu Glu Gly Leu
1 5 10



(2) INFORMATION FOR SEQ ID NO : 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site

(B) LOCATION: 11..12

(D) OTHER INFORMATION: /note= "X = any"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Ala Leu Glu Ser Asn Val Glu Glu Gly Leu Xaa Xaa Leu Ser
1 5 10



(2) INFORMATION FOR SEQ ID NO : 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 31: 

Ala Leu Glu Phe Asn Gly Glu 
1 5 



(2) INFORMATION FOR SEQ ID NO : 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 
(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 7

(D) OTHER INFORMATION: /note= "X = any"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Ser Ile Thr Asp Leu Gly Xaa Lys Val
1 5 



(2) INFORMATION FOR SEQ ID NO : 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY : linear 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site

(B) LOCATION: 13..15

(D) OTHER INFORMATION: /note= "X = any"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Ser Ile Thr Asp Leu Gly Thr Ile Val Asp Gly Phe Xaa Xaa
1 5 10



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Ser Ile Thr Asp Leu Gly Thr Ile Val Asp
1 5 10



(2) INFORMATION FOR SEQ ID NO : 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 5..19

(D) OTHER INFORMATION: /note= "X = any"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Val Asp Ala Leu Xaa Thr Lys Val Asn Ala Leu Asp Xaa Lys Val Asn 
1 5 10 15

Ser Asp Xaa Thr 
20 



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

Leu Leu Ala Glu Gln Gln Leu Asn Gly Lys Thr Leu Thr Pro Val
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 37: 

Ala Lys His Asp Ala Ala Ser Thr Glu Lys Gly Lys Met Asp 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

Ala Leu Glu Ser Asn Val Glu Glu Gly Leu Leu Asp Leu Ser Gly 
1 5 10 15 
(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

Asn Gln Asn Thr Leu Ile Glu Lys Thr Ala Asn Lys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE : amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Ile Asp Lys Asn Glu Tyr Ser Ile Lys
1 5 



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 41: 

Ser Ile Thr Asp Leu Gly Thr Lys
1 5 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Asn Gln Asn Thr Leu Ile Glu Lys
1 5 



(2) INFORMATION FOR SEQ ID NO: 43: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

Ala Leu His Glu Gln Gln Leu Glu Thr Leu Thr Lys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Asn Ser Ser Asp
1



(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

Asn Lys Ala Asp Ala Asp Ala Ser Phe Glu Thr Leu Thr Lys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

Phe Ala Ala Thr Ala Ile Ala Lys Asp Lys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

Lys Ala Ser Ser Glu Asn Thr Gln Asn Ile Ala Lys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

Arg Leu Leu Asp Gln Lys
1 5 



(2) INFORMATION FOR SEQ ID NO : 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site

(B) LOCATION: 12 

(D) OTHER INFORMATION: /note= "X = any" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 49: 

Ala Ala Thr Ala Asp Ala Ile Thr Lys Asn Gly Xaa
1 5 10



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 
(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 4..8

(D) OTHER INFORMATION: /note= "X = any"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

Ala Lys Ala Xaa Ala Ala Asn Xaa Asp Arg
1 5 10



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

Asn Gln Ala Asp Ile Ala Gln Asn Gln Thr Asp Ile Gln Asp Leu Ala
1 5 10 15

Ala Tyr Asn Glu Leu Gln



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

Asn Gln Ala Asp Ile Ala Asn Asn Ile Asn Asn Ile Tyr Glu Leu Ala
1 5 10 15

Gln Gln Gln Asp Gln



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 



Tyr Asn Glu Arg Gln Thr Glu Ala Ile Asp Ala Leu Asn
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Ile Leu Gly Asp Thr Ala Ile Val Ser Asn Ser Gln Asp
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

Lys Ala Leu Glu Ser Asn Val Glu Glu Gly Leu Leu Asp Leu Ser Gly
1 5 10 15

Arg 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Ala Leu Glu Ser Asn Val Glu Glu Gly Leu Leu Glu Leu Ser Gly Arg
1 5 10 15

Thr Ile Asp Gln Arg
20 
(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site

(B) LOCATION: 11

(D) OTHER INFORMATION: /note= "X = any"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Asn Gln Ala His Ile Ala Asn Asn Ile Asn Xaa Ile Tyr Glu Leu Ala
1 5 10 15

Gln Gln Gln Asp Gln Lys



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Asn Gln Ala Asp Ile Ala Gln Asn Gln Thr Asp Ile Gln Asp Leu Ala
1 5 10 15

Ala Tyr Asn Glu Leu Gln



(2) INFORMATION FOR SEQ ID NO : 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

Ala Thr His Asp Tyr Asn Glu Arg Gln Thr Glu Ala
1 5 10



(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 60: 

Lys Ala Ser Ser Glu Asn Thr Gln Asn Ile Ala Lys
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Met Ile Leu Gly Asp Thr Ala Ile Val Ser Asn Ser Gln Asp Asn Lys
1 5 10 15

Thr Gln Leu Lys Phe Tyr Lys
20



(2) INFORMATION FOR SEQ ID NO: 62: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 12..13

(D) OTHER INFORMATION: /note= "X = any"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Ala Gly Asp Thr Ile Ile Pro Leu Asp Asp Asp Xaa Xaa Pro
1 5 10



(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site

(B) LOCATION: 8

(D) OTHER INFORMATION: /note= "X = any"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Leu Leu His Glu Gln Gln Leu Xaa Gly Lys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site

(B) LOCATION: 5

(D) OTHER INFORMATION: /note= "X = any"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

Ile Phe Phe Asn Xaa Gly
1 5 



(2) INFORMATION FOR SEQ ID NO : 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

Asn Asn Ile Asn Asn Ile Tyr Glu Leu Ala Gln Gln Gln Asp Gln His
1 5 10 15

Ser Ser Asp Ile Lys Thr Leu
20



(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY : linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 
GGTGCAGGTC AGATCAGTGA C 

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 67: 
GCCACCAACC AAGCTGAC 

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 
AGCGGTCGCC TGCTTGATCA G 

(2) INFORMATION FOR SEQ ID NO : 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 
CTGATCAAGC AGGCGACCGC T 
(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 
CAAGATCTGG CCGCTTACAA 20



(2) INFORMATION FOR SEQ ID NO : 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:
TTGTAAGCGG CCAGATCTTG 20



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 
TGCATGAGCC GCAAACCC 18 



(2) INFORMATION FOR SEQ ID NO : 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

Leu Leu Ala Glu Gln Gln Leu Asn Gly
1 5



(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Ala Leu Glu Ser Asn Val Glu Glu Gly Leu 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

Ala Leu Glu Ser Asn Val Glu Glu Gly Leu Leu Asp Leu Ser 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3788 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

Thr His Glu Phe Ile Arg Ser Thr Ser Glu Gln Glu Asn Cys Glu Ser
1 5 10 15

Ala Arg Glu Asn Thr His Glu Asp Ile Ser Lys Glu Thr Thr Glu Ser
20 25 30

Ala Asn Asp Ala Thr Leu Glu Ala Ser Thr Ser Met Glu Phe Thr His
35 40 45

Glu Ser Glu Ala Leu Arg Glu Ala Asp Tyr Asn Thr His Glu Thr Asp
50 55 60

Arg Ile Val Glu Cys His Glu Cys Lys Trp Ile Thr His Gly Glu Arg
65 70 75 80

Arg Ile Thr His Glu Ser Glu Ser Glu Gln Glu Asn Cys Glu Ser Ala
85 90 95

Arg Glu Asn Ala Met Glu Asp Asn Thr His Glu Asp Ile Ser Lys Glu
100 105 110

Thr Thr Glu Ser Ala Asn Asp His Ala Arg Asp Cys Pro Ile Glu Ser
115 120 125

Ala Thr Thr Ala Cys His Glu Asp Ala Ser Phe Leu Leu Trp Ser Ala
130 135 140

Asn Asp Asp Asn Thr His Ala Val Glu Ala Asn Tyr Ser Pro Glu Cys
145 150 155 160

Ile Ala Leu Cys His Ala Arg Ala Cys Thr Glu Arg Ser Asp Asn Thr
165 170 175

Thr Arg Ala Asn Ser Leu Ala Thr Glu Ala Asn Tyr Ser Glu Gln Glu
180 185 190

Asn Cys Glu Ser Thr His Ala Thr Ile Ser Thr His Glu Arg Glu Ile
195 200 205

Ser Asn Ser Thr Ala Arg Thr Cys Asp Asn Ser Glu Gln Ile Asp Asn
210 215 220

Phe Ile Leu Glu Asn Ala Met Glu Thr Tyr Pro Glu Ser Thr Arg Ala
225 230 235 240

Asn Asp Thr Pro Leu Gly Tyr Ser Glu Gln Ile Asp Asn Glu Ser Pro
245 250 255

Ala Ala Ala Pro Arg Thr Glu Ile Asn Asn Ala Leu Ile Asn Glu Ala
260 265 270

Arg Ser Glu Gln Ile Asp Asn Glu Ser Pro Ala Asn Ala Asp Asn Ala
275 280 285

Asp Asx Leu Glu Leu Ile Asn Glu Ala Arg Ser Glu Gln Ile Asp Asn
290 295 300

Glu Ser Pro Ala Ala Ala Pro Arg Thr Glu Ile Asn Asn Ala Leu Ile
305 310 315 320

Asn Glu Ala Arg Ser Glu Gln Ile Asp Asn Glu Ser Pro Ala Asn Ala
325 330 335

Asp Asn Ala Asp Asx Leu Glu Leu Ile Asn Glu Ala Arg Ser Glu Gln
340 345 350

Ile Asp Asn Glu Ser Pro Ala Ala Ala Pro Ala Thr Pro Arg Thr Glu
355 360 365

Ile Asn Asn Ala Leu Ile Asn Glu Ala Arg Ser Glu Gln Ile Asp Asn
370 375 380

Glu Ser Pro Ala Asn Ala Pro Ala Thr Asp Asn Ala Asp Asx Leu Glu
385 390 395 400




Leu Ile Asn Glu Ala Arg Ser Glu Gln Ile Asp Asn Glu Ser Pro Ala
405 410 415

Ala Ala Pro Ala Thr Pro Arg Thr Glu Ile Asn Asn Ala Leu Ile Asn
420 425 430

Glu Ala Arg Ser Glu Gln Ile Asp Asn Glu Ser Pro Ala Asn Ala Pro
435 440 445

Ala Thr Asp Asn Ala Asp Asx Leu Glu Leu Ile Asn Glu Ala Arg Ser
450 455 460

Glu Gln Ile Asp Asn Thr Thr Ala Ser Pro Ala Ala Ala Pro Ala Thr
465 470 475 480

Pro Arg Thr Glu Ile Asn Asn Ala Leu Ile Asn Glu Ala Arg Ser Glu
485 490 495

Gln Ile Asp Asn Thr Thr Ala Ser Pro Ala Asn Ala Pro Ala Thr Asp
500 505 510

Asn Ala Asp Asx Leu Glu Leu Ile Asn Glu Ala Arg Ser Glu Gln Ile
515 520 525

Asp Asn Thr Thr Ala Ser Pro Ala Ala Ala Pro Ala Thr Pro Arg Thr
530 535 540

Glu Ile Asn Asn Ala Leu Ile Asn Glu Ala Arg Ser Glu Gln Ile Asp
545 550 555 560

Asn Thr Thr Ala Ser Pro Ala Asn Ala Pro Ala Thr Asp Asn Ala Asp
565 570 575

Asx Leu Glu Leu Ile Asn Glu Ala Arg Ser Glu Gln Ile Asp Asn Thr
580 585 590

Thr Ala Ser Pro Ala Ala Ala Pro Ala Thr Pro Arg Thr Glu Ile Asn
595 600 605

Asn Ala Leu Ile Asn Glu Ala Arg Ser Glu Gln Ile Asp Asn Thr Thr
610 615 620

Ala Ser Pro Ala Asn Ala Pro Ala Thr Asp Asn Ala Asp Asx Leu Glu
625 630 635 640

Leu Ile Asn Glu Ala Arg Ser Glu Gln Ile Asp Asn Thr Thr Ala Ser
645 650 655

Pro Ala Ala Ala Pro Ala Thr Pro Arg Thr Glu Ile Asn Asn Ala Leu
660 665 670

Ile Asn Glu Ala Arg Ser Glu Gln Ile Asp Asn Thr Thr Ala Ser Pro
675 680 685

Ala Asn Ala Pro Ala Thr Asp Asn Ala Asp Asx Leu Glu Leu Ile Asn
690 695 700 
Glu Ala Arg Ser Glu Gln Ile Asp Asn Thr His Arg Gly His Ser Glu
705 710 715 720

Gln Ile Asp Asn Ile Ser Gly Ile Val Glu Asn Asx Glu Leu Trp Ala
725 730 735

Asn Asp Ala Arg Glu Asn Thr Asn Thr His Glu Asp Ile Ser Lys Glu
740 745 750

Thr Thr Glu Ser Ser Glu Gln Glu Asn Cys Glu Ser Glu Gln Ile Asp
755 760 765

Asn Thr Tyr Pro Glu Thr Pro Leu Gly Tyr Ser Thr Arg Ala Asn Asp
770 775 780

Ser Pro Glu Cys Ile Ala Leu Ala Gln Gln Gln Asp Gln His Ser Glu
785 790 795 800

Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg
805 810 815

Asn Ala Tyr Glu Leu Ala Gln Gln Gln Asp Gln His Ser Glu Gln Ile
820 825 830

Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala
835 840 845

Tyr Asp Leu Ala Gln Gln Gln Asp Gln His Ser Glu Gln Ile Asp Asn
850 855 860

Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Gly Ala
865 870 875 880

Cys Gly Cys Thr Cys Ala Ala Cys Ala Gly Cys Ala Cys Thr Ala Ala
885 890 895

Thr Ala Cys Gly Ser Glu Gln Ile Asp Asn Asp Asn Ala Leu Ile Asn
900 905 910

Glu Ala Arg Asp Asx Leu Glu Cys Cys Ala Ala Gly Cys Thr Gly Ala
915 920 925

Thr Ala Thr Cys Ala Cys Thr Ala Cys Cys Ser Glu Gln Ile Asp Asn
930 935 940

Asp Asn Ala Leu Ile Asn Glu Ala Arg Asp Asx Leu Glu Thr Cys Ala
945 950 955 960

Ala Thr Gly Cys Cys Thr Thr Thr Gly Ala Thr Gly Gly Thr Cys Ser
965 970 975

Glu Gln Ile Asp Asn Asp Asn Ala Leu Ile Asn Glu Ala Arg Asp Asx
980 985 990

Leu Glu Thr Gly Thr Ala Thr Gly Cys Cys Gly Cys Thr Ala Cys Thr
995 1000 1005
Cys Gly Cys Ala Gly Cys Thr Ser Glu Gln Ile Asp Asn Asp Asn Ala
1010 1015 1020

Leu Ile Asn Glu Ala Arg Asp Asx Leu Glu Asn Xaa Ala Xaa Xaa Tyr
1025 1030 1035 1040

Ser Xaa Ile Gly Gly Gly Xaa Asn Ser Glu Gln Ile Asp Asn Pro Arg
1045 1050 1055

Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Xaa Ala Asn Tyr
1060 1065 1070

Ala Thr Pro Ser Ile Thr Ile Asn Ser Gln Ala Asp Ile Ser Glu Gln
1075 1080 1085

Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn
1090 1095 1100

Ala Ala Ala Gln Ala Ala Leu Ser Gly Leu Phe Val Pro Tyr Ser Val
1105 1110 1115 1120

Gly Lys Phe Asn Ala Thr Ala Ala Leu Gly Gly Tyr Gly Ser Lys Ser
1125 1130 1135

Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala
1140 1145 1150

Arg Asn Ala Gly Lys Ile Thr Lys Asn Ala Ala Arg Gln Glu Asn Gly
1155 1160 1165

Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu
1170 1175 1180

Ala Arg Asn Ala Val Ile Gly Asp Leu Gly Arg Lys Val Ser Glu Gln
1185 1190 1195 1200

Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn
1205 1210 1215

Ala Ala Leu Glu Xaa Asn Val Glu Glu Gly Leu Ser Glu Gln Ile Asp
1220 1225 1230

Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Xaa
1235 1240 1245

Ala Asn Tyr Ala Thr Pro Ser Ile Thr Ile Asn Ala Leu Glu Ser Asn
1250 1255 1260

Val Glu Glu Gly Leu Xaa Xaa Leu Ser Ser Glu Gln Ile Asp Asn Pro
1265 1270 1275 1280

Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Xaa Ala Asn
1285 1290 1295

Tyr Ala Thr Pro Ser Ile Thr Ile Asn Ser Ala Leu Glu Phe Asn Gly
1300 1305 1310
Glu Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn
1315 1320 1325

Glu Ala Arg Asn Ala Ser Ile Thr Asp Leu Gly Xaa Lys Val Ser Glu
1330 1335 1340

Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg
1345 1350 1355 1360

Asn Ala Xaa Ala Asn Tyr Ala Thr Pro Ser Ile Thr Ile Asn Ser Ile
1365 1370 1375

Thr Asp Leu Gly Thr Ile Val Asp Gly Phe Xaa Xaa Xaa Ser Glu Gln
1380 1385 1390

Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn
1395 1400 1405

Ala Xaa Ala Asn Tyr Ala Thr Pro Ser Ile Thr Ile Asn Ser Ser Ile
1410 1415 1420

Thr Asp Leu Gly Thr Ile Val Asp Ser Glu Gln Ile Asp Asn Pro Arg
1425 1430 1435 1440

Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Val Asp Ala Leu
1445 1450 1455

Xaa Thr Lys Val Asn Ala Leu Asp Xaa Lys Val Asn Ser Asp Xaa Thr
1460 1465 1470

Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu
1475 1480 1485

Ala Arg Asn Ala Xaa Ala Asn Tyr Ala Thr Pro Ser Ile Thr Ile Asn
1490 1495 1500

Ser Leu Leu Ala Glu Gln Gln Leu Asn Gly Lys Thr Leu Thr Pro Val
1505 1510 1515 1520

Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu
1525 1530 1535

Ala Arg Asn Ala Ala Lys His Asp Ala Ala Ser Thr Glu Lys Gly Lys
1540 1545 1550

Met Asp Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile
1555 1560 1565

Asn Glu Ala Arg Asn Ala Ala Leu Glu Ser Asn Val Glu Glu Gly Leu
1570 1575 1580

Leu Asp Leu Ser Gly Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile
1585 1590 1595 1600

Asn Leu Ile Asn Glu Ala Arg Asn Ala Asn Gln Asn Thr Leu Ile Glu
1605 1610 1615
Lys Thr Ala Asn Lys Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile
1620 1625 1630

Asn Leu Ile Asn Glu Ala Arg Asn Ala Ile Asp Lys Asn Glu Tyr Ser
1635 1640 1645

Ile Lys Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile
1650 1655 1660

Asn Glu Ala Arg Asn Ala Ser Ile Thr Asp Leu Gly Thr Lys Ser Glu
1665 1670 1675 1680

Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg
1685 1690 1695

Asn Ala Asn Gln Asn Thr Leu Ile Glu Lys Ser Glu Gln Ile Asp Asn
1700 1705 1710

Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Ala Leu
1715 1720 1725

His Glu Gln Gln Leu Glu Thr Leu Thr Lys Ser Glu Gln Ile Asp Asn
1730 1735 1740

Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Asn Ser
1745 1750 1755 1760

Ser Asp Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile
1765 1770 1775

Asn Glu Ala Arg Asn Ala Asn Lys Ala Asp Ala Asp Ala Ser Phe Glu
1780 1785 1790

Thr Leu Thr Lys Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn
1795 1800 1805

Leu Ile Asn Glu Ala Arg Asn Ala Phe Ala Ala Thr Ala Ile Ala Lys
1810 1815 1820

Asp Lys Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile
1825 1830 1835 1840

Asn Glu Ala Arg Asn Ala Lys Ala Ser Ser Glu Asn Thr Gln Asn Ile
1845 1850 1855

Ala Lys Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile
1860 1865 1870

Asn Glu Ala Arg Asn Ala Arg Leu Leu Asp Gln Lys Ser Glu Gln Ile
1875 1880 1885

Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala
1890 1895 1900

Ala Ala Thr Ala Asp Ala Ile Thr Lys Asn Gly Xaa Ser Glu Gln Ile
1905 1910 1915 1920
Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala
1925 1930 1935

Ala Lys Ala Xaa Ala Ala Asn Xaa Asp Arg Ser Glu Gln Ile Asp Asn
1940 1945 1950

Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Xaa Ala
1955 1960 1965

Asn Tyr Ala Thr Pro Ser Ile Thr Ile Asn Ser Asn Gln Ala Asp Ile
1970 1975 1980

Ala Gln Asn Gln Thr Asp Ile Gln Asp Leu Ala Ala Tyr Asn Glu Leu
1985 1990 1995 2000

Gln Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn
2005 2010 2015

Glu Ala Arg Asn Ala Asn Gln Ala Asp Ile Ala Asn Asn Ile Asn Asn
2020 2025 2030

Ile Tyr Glu Leu Ala Gln Gln Gln Asp Gln Ser Glu Gln Ile Asp Asn
2035 2040 2045

Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Tyr Asn
2050 2055 2060

Glu Arg Gln Thr Glu Ala Ile Asp Ala Leu Asn Ser Glu Gln Ile Asp
2065 2070 2075 2080

Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Ile
2085 2090 2095

Leu Gly Asp Thr Ala Ile Val Ser Asn Ser Gln Asp Ser Glu Gln Ile
2100 2105 2110

Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala
2115 2120 2125

Lys Ala Leu Glu Ser Asn Val Glu Glu Gly Leu Leu Asp Leu Ser Gly
2130 2135 2140

Arg Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn
2145 2150 2155 2160

Glu Ala Arg Asn Ala Ala Leu Glu Ser Asn Val Glu Glu Gly Leu Leu
2165 2170 2175

Glu Leu Ser Gly Arg Thr Ile Asp Gln Arg Ser Glu Gln Ile Asp Asn
2180 2185 2190

Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Asn Gln
2195 2200 2205

Ala His Ile Ala Asn Asn Ile Asn Xaa Ile Tyr Glu Leu Ala Gln Gln
2210 2215 2220
Gln Asp Gln Lys Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn
2225 2230 2235 2240

Leu Ile Asn Glu Ala Arg Asn Ala Xaa Ala Asn Tyr Ala Thr Pro Ser
2245 2250 2255

Ile Thr Ile Asn Asn Gln Ala Asp Ile Ala Gln Asn Gln Thr Asp Ile
2260 2265 2270

Gln Asp Leu Ala Ala Tyr Asn Glu Leu Gln Ser Glu Gln Ile Asp Asn
2275 2280 2285

Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Ala Thr
2290 2295 2300

His Asp Tyr Asn Glu Arg Gln Thr Glu Ala Ser Glu Gln Ile Asp Asn
2305 2310 2315 2320

Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Lys Ala
2325 2330 2335

Ser Ser Glu Asn Thr Gln Asn Ile Ala Lys Ser Glu Gln Ile Asp Asn
2340 2345 2350

Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Met Ile
2355 2360 2365

Leu Gly Asp Thr Ala Ile Val Ser Asn Ser Gln Asp Asn Lys Thr Gln
2370 2375 2380

Leu Lys Phe Tyr Lys Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile
2385 2390 2395 2400

Asn Leu Ile Asn Glu Ala Arg Asn Ala Ala Gly Asp Thr Ile Ile Pro
2405 2410 2415

Leu Asp Asp Asp Xaa Xaa Pro Ser Glu Gln Ile Asp Asn Pro Arg Thr
2420 2425 2430

Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Xaa Ala Asn Tyr Ala
2435 2440 2445

Thr Pro Ser Ile Thr Ile Asn Ser Leu Leu His Glu Gln Gln Leu Xaa
2450 2455 2460

Gly Lys Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile
2465 2470 2475 2480

Asn Glu Ala Arg Asn Ala Xaa Ala Asn Tyr Ala Thr Pro Ser Ile Thr
2485 2490 2495

Ile Asn Ile Phe Phe Asn Xaa Gly Ser Glu Gln Ile Asp Asn Pro Arg
2500 2505 2510

Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Xaa Ala Asn Tyr
2515 2520 2525
Ala Thr Pro Ser Ile Thr Ile Asn Asn Asn Ile Asn Asn Ile Tyr Glu
2530 2535 2540

Leu Ala Gln Gln Gln Asp Gln His Ser Ser Asp Ile Lys Thr Leu Ser
2545 2550 2555 2560

Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala
2565 2570 2575

Arg Asn Ala Gly Gly Thr Gly Cys Ala Gly Gly Thr Cys Ala Gly Ala
2580 2585 2590

Thr Cys Ala Gly Thr Gly Ala Cys Ser Glu Gln Ile Asp Asn Asp Asn
2595 2600 2605

Ala Leu Ile Asn Glu Ala Arg Asp Asx Leu Glu Gly Cys Cys Ala Cys
2610 2615 2620

Cys Ala Ala Cys Cys Ala Ala Gly Cys Thr Gly Ala Cys Ser Glu Gln
2625 2630 2635 2640

Ile Asp Asn Asp Asn Ala Leu Ile Asn Glu Ala Arg Asp Asx Leu Glu
2645 2650 2655

Ala Gly Cys Gly Gly Thr Cys Gly Cys Cys Thr Gly Cys Thr Thr Gly
2660 2665 2670

Ala Thr Cys Ala Gly Ser Glu Gln Ile Asp Asn Asp Asn Ala Leu Ile
2675 2680 2685

Asn Glu Ala Arg Asp Asx Leu Glu Cys Thr Gly Ala Thr Cys Ala Ala
2690 2695 2700

Gly Cys Ala Gly Gly Cys Gly Ala Cys Cys Gly Cys Thr Ser Glu Gln
2705 2710 2715 2720

Ile Asp Asn Asp Asn Ala Leu Ile Asn Glu Ala Arg Asp Asx Leu Glu
2725 2730 2735

Cys Ala Ala Gly Ala Thr Cys Thr Gly Gly Cys Cys Gly Cys Thr Thr
2740 2745 2750

Ala Cys Ala Ala Ser Glu Gln Ile Asp Asn Asp Asn Ala Leu Ile Asn
2755 2760 2765

Glu Ala Arg Asp Asx Leu Glu Thr Thr Gly Thr Ala Ala Gly Cys Gly
2770 2775 2780

Gly Cys Cys Ala Gly Ala Thr Cys Thr Thr Gly Ser Glu Gln Ile Asp
2785 2790 2795 2800

Asn Asp Asn Ala Leu Ile Asn Glu Ala Arg Asp Asx Leu Glu Thr Gly
2805 2810 2815

Cys Ala Thr Gly Ala Gly Cys Cys Gly Cys Ala Ala Ala Cys Cys Cys
2820 2825 2830
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Ser Glu Gin He Asp Asn Asp Asn Ala Leu He Asn Glu Ala Arg Asp 
2835 2840 2845 

Asx Leu Glu Leu Leu Ala Glu Gin Gin Leu Asn Gly Ser Glu Gin He 



2850 



2855 2860 



Asp Asn Pro Arg Thr Glu He Asn Leu He Asn Glu Ala Arg Asn Ala 
2865 2870 2875 2880 

Ala Leu Glu Ser Asn Val Glu Glu Gly Leu Ser Glu Gln Ile Asp Asn 
2885 2890 2895 

Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Ala Leu 
2900 2905 2910 

Glu Ser Asn Val Glu Glu Gly Leu Leu Asp Leu Ser Ser Glu Gln Ile 
2915 2920 2925 

Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala 
2930 2935 2940 

Asn Ala Lys Ala Ser Ala Ala Asn Thr Asp Arg Ser Glu Gln Ile Asp 
2945 2950 2955 2960 

Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Ala 
2965 2970 2975 

Ala Thr Ala Ala Asp Ala Ile Thr Lys Asn Gly Asn Ser Glu Gln Ile 
2980 2985 2990 

Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala 
2995 3000 3005 

Ser Ile Thr Asp Leu Gly Thr Lys Val Asp Gly Phe Asp Gly Arg Ser 
3010 3015 3020 

Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala 
3025 3030 3035 3040 

Arg Asn Ala Val Asp Ala Leu Xaa Thr Lys Val Asn Ala Leu Asp Xaa 
3045 3050 3055 

Lys Val Asn Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu 
3060 3065 3070 

Ile Asn Glu Ala Arg Asn Ala Xaa Ala Asn Tyr Ala Thr Pro Ser Ile 
3075 3080 3085 

Thr Ile Asn Ser Ala Ala Gln Ala Ala Leu Ser Gly Leu Phe Val Pro 
3090 3095 3100 

Tyr Ser Val Gly Lys Phe Asn Ala Thr Ala Ala Leu Gly Gly Tyr Gly 
3105 3110 3115 3120 

Ser Lys Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile 
3125 3130 3135 

Asn Glu Ala Arg Asn Ala Ser Gly Arg Leu Leu Asp Gln Lys Ala Asp 
3140 3145 3150 

Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu 
3155 3160 3165 

Ala Arg Asn Ala Gln Lys Ala Asp Ile Asp Asn Asn Ile Asn Ser Glu 
3170 3175 3180 

Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg 
3185 3190 3195 3200 

Asn Ala Asn Asn Ile Asn Asn Ile Tyr Glu Leu Ala Ser Glu Gln Ile 
3205 3210 3215 

Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala 
3220 3225 3230 

Asn Asn Ile Tyr Glu Leu Ala Gln Gln Gln Ser Glu Gln Ile Asp Asn 
3235 3240 3245 

Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Ala Gln 
3250 3255 3260 

Gln Gln Asp Gln His Ser Ser Asp Ser Glu Gln Ile Asp Asn Pro Arg 
3265 3270 3275 3280 

Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Gln Asp Gln His 
3285 3290 3295 

Ser Ser Asp Ile Lys Thr Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu 
3300 3305 3310 

Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala His Ser Ser Asp Ile Lys 
3315 3320 3325 

Thr Leu Lys Asn Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn 
3330 3335 3340 

Leu Ile Asn Glu Ala Arg Asn Ala Asp Ile Lys Thr Leu Lys Asn Asn 
3345 3350 3355 3360 

Val Glu Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile 
3365 3370 3375 

Asn Glu Ala Arg Asn Ala Thr Leu Lys Asn Asn Val Glu Glu Gly Leu 
3380 3385 3390 

Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu 
3395 3400 3405 

Ala Arg Asn Ala Glu Glu Gly Leu Leu Asp Leu Ser Gly Arg Ser Glu 
3410 3415 3420 

Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg 
3425 3430 3435 3440 

Asn Ala Leu Ser Gly Arg Leu Ile Asp Gln Lys Ala Ser Glu Gln Ile 
3445 3450 3455 

Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala 
3460 3465 3470 

Asp Gln Lys Ala Asp Ile Ala Lys Asn Gln Ser Glu Gln Ile Asp Asn 
3475 3480 3485 

Pro Arg Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Ala Lys 
3490 3495 3500 



Asn Gln Ala Asp Ile Ala Gln Asn Ser Glu Gln Ile Asp Asn Pro Arg 
3505 3510 3515 3520 

Thr Glu Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Ile Ala Gln Asn 
3525 3530 3535 

Gln Thr Asp Ile Gln Asp Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu 
3540 3545 3550 

Ile Asn Leu Ile Asn Glu Ala Arg Asn Ala Asp Ile Gln Asp Leu Ala 
3555 3560 3565 

Ala Tyr Asn Glu Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn 
3570 3575 3580 

Leu Ile Asn Glu Ala Arg Asn Ala Cys Gly Gly Gly Ala Thr Cys Cys 
3585 3590 3595 3600 

Gly Thr Gly Ala Ala Gly Ala Ala Ala Ala Ala Thr Gly Cys Cys Gly 
3605 3610 3615 

Cys Ala Gly Gly Thr Ser Glu Gln Ile Asp Asn Asp Asn Ala Leu Ile 
3620 3625 3630 

Asn Glu Ala Arg Asp Asx Leu Glu Cys Gly Gly Gly Ala Thr Cys Cys 
3635 3640 3645 

Cys Gly Thr Cys Gly Cys Ala Ala Gly Cys Cys Gly Ala Thr Thr Gly 
3650 3655 3660 

Ser Glu Gln Ile Asp Asn Asp Asn Ala Leu Ile Asn Glu Ala Arg Asp 
3665 3670 3675 3680 

Asx Leu Glu Ser Gly Arg Leu Leu Asp Gln Lys Ala Asp Ile Asp Asn 
3685 3690 3695 

Asn Ile Asn Asn Ile Tyr Glu Leu Ala Gln Gln Gln Asp Gln His Ser 
3700 3705 3710 

Ser Asp Ile Lys Thr Leu Lys Asn Asn Val Glu Glu Gly Leu Leu Asp 
3715 3720 3725 

Leu Ser Gly Arg Leu Ile Asp Gln Lys Ala Asp Ile Ala Lys Asn Gln 
3730 3735 3740 

Ala Asp Ile Ala Gln Asn Gln Thr Asp Ile Gln Asp Leu Ala Ala Tyr 
3745 3750 3755 3760 

Asn Glu Ser Glu Gln Ile Asp Asn Pro Arg Thr Glu Ile Asn Leu Ile 
3765 3770 3775 

Asn Glu Ala Arg Asn Ala Ala Trp Glx Xaa Asp Cys 
3780 3785 



(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 

Ala Ala Thr Ala Ala Asp Ala Ile Thr Lys Asn Gly Asn 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

Ser Ile Thr Asp Leu Gly Thr Lys Val Asp Gly Phe Asp Gly Arg 
1 5 10 15 



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 5..13 

(D) OTHER INFORMATION: /note= "X = any" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

Val Asp Ala Leu Xaa Thr Lys Val Asn Ala Leu Asp Xaa Lys Val Asn 
1 5 10 15 



(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Ala Ala Gln Ala Ala Leu Ser Gly Leu Phe Val Pro Tyr Ser Val Gly 
1 5 10 15 

Lys Phe Asn Ala Thr Ala Ala Leu Gly Gly Tyr Gly Ser Lys 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

Ser Gly Arg Leu Leu Asp Gln Lys Ala Asp 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

Gln Lys Ala Asp Ile Asp Asn Asn Ile Asn 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 

Asn Asn Ile Asn Asn Ile Tyr Glu Leu Ala 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

Asn Asn Ile Tyr Glu Leu Ala Gln Gln Gln 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 

Ala Gln Gln Gln Asp Gln His Ser Ser Asp 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 

Gln Asp Gln His Ser Ser Asp Ile Lys Thr 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

His Ser Ser Asp Ile Lys Thr Leu Lys Asn 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

Asp Ile Lys Thr Leu Lys Asn Asn Val Glu 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

Thr Leu Lys Asn Asn Val Glu Glu Gly Leu 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

Glu Glu Gly Leu Leu Asp Leu Ser Gly Arg 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

Leu Ser Gly Arg Leu Ile Asp Gln Lys Ala 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

Asp Gln Lys Ala Asp Ile Ala Lys Asn Gln 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

Ala Lys Asn Gln Ala Asp Ile Ala Gln Asn 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 

Ile Ala Gln Asn Gln Thr Asp Ile Gln Asp 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

Asp Ile Gln Asp Leu Ala Ala Tyr Asn Glu 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 
CGGGATCCGT GAAGAAAAAT GCCGCAGGT 29 

(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 
CGGGATCCCG TCGCAAGCCG ATTG 24 
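
The two oligonucleotides of SEQ ID NO: 96 and SEQ ID NO: 97 can be checked against their stated lengths with a few lines of Python. The sketch below is illustrative only. Both sequences begin with CGGGATCC, which contains GGATCC, the recognition sequence of the restriction enzyme BamHI; that the authors intended this as a cloning site is an assumption made here, and the helper name `summarize` is hypothetical.

```python
# Recognition sequence of BamHI (a well-known fact, not stated in the listing).
BAMHI_SITE = "GGATCC"

# Oligonucleotides copied from SEQ ID NO: 96 and SEQ ID NO: 97, spaces removed.
PRIMERS = {
    96: "CGGGATCCGTGAAGAAAAATGCCGCAGGT",
    97: "CGGGATCCCGTCGCAAGCCGATTG",
}


def summarize(seq: str) -> dict:
    """Return the length and the 1-based start of the first BamHI site (or None)."""
    idx = seq.find(BAMHI_SITE)
    return {
        "length": len(seq),
        "bamhi_start": idx + 1 if idx >= 0 else None,
    }


if __name__ == "__main__":
    for seq_id, seq in PRIMERS.items():
        print(f"SEQ ID NO: {seq_id}", summarize(seq))
```

The reported lengths can be compared directly with the LENGTH fields of the two entries.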



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 79 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

Ser Gly Arg Leu Leu Asp Gln Lys Ala Asp Ile Asp Asn Asn Ile Asn 
1 5 10 15 

Asn Ile Tyr Glu Leu Ala Gln Gln Gln Asp Gln His Ser Ser Asp Ile 
20 25 30 

Lys Thr Leu Lys Asn Asn Val Glu Glu Gly Leu Leu Asp Leu Ser Gly 
35 40 45 

Arg Leu Ile Asp Gln Lys Ala Asp Ile Ala Lys Asn Gln Ala Asp Ile 
50 55 60 

Ala Gln Asn Gln Thr Asp Ile Gln Asp Leu Ala Ala Tyr Asn Glu 
65 70 75 
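
The decapeptides of SEQ ID NO: 81 to SEQ ID NO: 95 appear to be overlapping windows of the 79-residue sequence of SEQ ID NO: 98. The following Python sketch, offered as an illustration rather than as part of the listing, converts SEQ ID NO: 98 to one-letter code and reports where each decapeptide and the sequence AQQQDQH (SEQ ID NO: 17) start. The one-letter transliterations and the helper names `to_one_letter` and `locate` are introduced here and are not part of the original text.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert a whitespace-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split())


# SEQ ID NO: 98 as given in the listing (79 residues).
SEQ_98 = to_one_letter("""
    Ser Gly Arg Leu Leu Asp Gln Lys Ala Asp Ile Asp Asn Asn Ile Asn
    Asn Ile Tyr Glu Leu Ala Gln Gln Gln Asp Gln His Ser Ser Asp Ile
    Lys Thr Leu Lys Asn Asn Val Glu Glu Gly Leu Leu Asp Leu Ser Gly
    Arg Leu Ile Asp Gln Lys Ala Asp Ile Ala Lys Asn Gln Ala Asp Ile
    Ala Gln Asn Gln Thr Asp Ile Gln Asp Leu Ala Ala Tyr Asn Glu
""")

# Decapeptides of SEQ ID NO: 81-95, transliterated to one-letter code.
DECAPEPTIDES = {
    81: "SGRLLDQKAD", 82: "QKADIDNNIN", 83: "NNINNIYELA",
    84: "NNIYELAQQQ", 85: "AQQQDQHSSD", 86: "QDQHSSDIKT",
    87: "HSSDIKTLKN", 88: "DIKTLKNNVE", 89: "TLKNNVEEGL",
    90: "EEGLLDLSGR", 91: "LSGRLIDQKA", 92: "DQKADIAKNQ",
    93: "AKNQADIAQN", 94: "IAQNQTDIQD", 95: "DIQDLAAYNE",
}


def locate(parent: str, fragment: str):
    """Return the 1-based start of fragment in parent, or None if absent."""
    idx = parent.find(fragment)
    return idx + 1 if idx >= 0 else None


if __name__ == "__main__":
    for seq_id, peptide in DECAPEPTIDES.items():
        print(f"SEQ ID NO: {seq_id}  {peptide}  start={locate(SEQ_98, peptide)}")
    print("AQQQDQH start:", locate(SEQ_98, "AQQQDQH"))
```

Positions are counted within SEQ ID NO: 98 itself, not within the full-length UspA1 or UspA2 proteins.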



CLAIMS 



1. An isolated peptide of about 7 to about 60 amino acids comprising the amino acid 
sequence AQQQDQH (SEQ ID NO: 17). 



2. The isolated peptide of claim 1, wherein said peptide is about 10 amino acids in length. 



3. The isolated peptide of claim 1, wherein said peptide is about 20 amino acids in length. 



4. The isolated peptide of claim 1, wherein said peptide is about 30 amino acids in length. 



5. The isolated peptide of claim 1, wherein said peptide is about 40 amino acids in length. 



6. The isolated peptide of claim 1, wherein said peptide is about 50 amino acids in length. 



7. The isolated peptide of claim 1, wherein said peptide is about 60 amino acids in length. 



8. The isolated peptide of claim 1, wherein said peptide is at least about 16 consecutive 
residues of the amino acid sequence YELAQQQDQH (SEQ ID NO: 18). 



9. An antigenic composition comprising (a) an isolated peptide of about 7 to about 60 
amino acids comprising the amino acid sequence AQQQDQH (SEQ ID NO: 17) and (b) 
a pharmaceutically acceptable buffer or diluent. 



10. The antigenic composition of claim 9, wherein said antigenic composition further 
comprises a carrier conjugated to said peptide. 



11. The antigenic composition of claim 10, wherein said carrier is KLH, diphtheria toxoid, 
tetanus toxoid or CRM197. 



12. The antigenic composition of claim 9, further comprising an adjuvant. 



13. The antigenic composition of claim 12, wherein said adjuvant comprises a lipid. 



14. The antigenic composition of claim 9 wherein said peptide is covalently linked to a 
second antigen. 



15. The antigenic composition of claim 14, wherein said second antigen is a peptide antigen. 

16. The antigenic composition of claim 14, wherein said second antigen is a non-peptide 
antigen. 

17. The antigenic composition of claim 9, wherein said isolated peptide comprises at least 
about 16 consecutive residues of the amino acid sequence YELAQQQDQH (SEQ ID 
NO: 18). 

18. A vaccine composition comprising an isolated peptide of about 7 to about 60 amino 
acids comprising the amino acid sequence AQQQDQH (SEQ ID NO: 17) and a 
pharmaceutically acceptable buffer or diluent. 

19. The vaccine composition of claim 18, wherein said isolated peptide is further defined as 
comprising at least about 16 consecutive residues of the amino acid sequence 
YELAQQQDQH (SEQ ID NO: 18). 

20. A method for inducing an immune response in a mammal comprising the step of 
providing to said mammal an antigenic composition comprising (a) an isolated peptide 
of about 7 to about 60 amino acids comprising the amino acid sequence AQQQDQH 
(SEQ ID NO: 17) and (b) a pharmaceutically acceptable buffer or diluent. 

21. The method of claim 20, wherein said isolated peptide is further defined as comprising 
at least about 16 consecutive residues of the amino acid sequence YELAQQQDQH 
(SEQ ID NO: 18). 

22. A nucleic acid encoding the UspA1 antigen (SEQ ID NO:1) of the M. catarrhalis isolate 
035E. 

23. A nucleic acid having the uspA1 DNA sequence (SEQ ID NO:2) of the M. catarrhalis 
isolate 035E. 

24. A nucleic acid encoding the UspA2 antigen (SEQ ID NO:3) of the M. catarrhalis isolate 
035E. 

25. A nucleic acid having the uspA2 DNA sequence (SEQ ID NO:4) of the M. catarrhalis 
isolate 035E. 

26. A nucleic acid encoding the UspA1 antigen (SEQ ID NO:5) of the M. catarrhalis isolate 
046E. 

27. A nucleic acid having the uspA1 DNA sequence (SEQ ID NO:6) of the M. catarrhalis 
isolate 046E. 

28. A nucleic acid encoding the UspA2 antigen (SEQ ID NO:7) of the M. catarrhalis isolate 
046E. 

29. A nucleic acid having the uspA2 DNA sequence (SEQ ID NO:8) of the M. catarrhalis 
isolate 046E. 

30. A nucleic acid encoding the UspA1 antigen (SEQ ID NO:9) of the M. catarrhalis isolate 
TTA24. 

31. A nucleic acid having the uspA1 DNA sequence (SEQ ID NO:10) of the M. catarrhalis 
isolate TTA24. 

32. A nucleic acid encoding the UspA2 antigen (SEQ ID NO:11) of the M. catarrhalis 
isolate TTA24. 

33. A nucleic acid having the uspA2 DNA sequence (SEQ ID NO:12) of the M. catarrhalis 
isolate TTA24. 

34. A nucleic acid encoding the UspA1 antigen (SEQ ID NO:13) of the M. catarrhalis 
isolate TTA37. 

35. A nucleic acid having the uspA1 DNA sequence (SEQ ID NO:14) of the M. catarrhalis 
isolate TTA37. 

36. A nucleic acid encoding the UspA2 antigen (SEQ ID NO:15) of the M. catarrhalis 
isolate TTA37. 



37. A nucleic acid having the uspA2 DNA sequence (SEQ ID NO:16) of the M. catarrhalis 
isolate TTA37. 

38. A method for diagnosing M. catarrhalis infection comprising the step of determining 
the presence, in a sample, of an M. catarrhalis amino acid sequence corresponding to 
residues of epitopic core sequences of said UspA1 or UspA2 antigen. 

39. The method of claim 38, wherein said determining comprises PCR. 

40. The method of claim 38, wherein said determining comprises immunologic reactivity of 
an antibody to an M. catarrhalis antigen. 

41. A method for treating an individual having an M. catarrhalis infection comprising 
providing to said individual an isolated peptide of about 7 to about 60 amino acids 
comprising the amino acid sequence AQQQDQH (SEQ ID NO:17). 

42. The isolated peptide of claim 41, wherein the said peptide comprises at least about 16 
consecutive residues of the amino acid sequence YELAQQQDQH (SEQ ID NO:18). 

43. A method for preventing or limiting an M. catarrhalis infection comprising providing to 
a subject an antibody that reacts immunologically with an epitope formed by the amino 
acid sequence AQQQDQH (SEQ ID NO:17). 

44. The method of claim 42, wherein said epitope is further defined as comprising at least 
about 16 consecutive residues of the amino acid sequence YELAQQQDQH (SEQ ID 
NO:18). 

45. A method for screening a peptide for reactivity with an antibody that binds 
immunologically to UspA1 or UspA2 comprising the steps of: 

a) providing said peptide; 

b) contacting said peptide with said antibody; and 

c) determining the binding of said antibody to said peptide. 

46. The method of claim 45, wherein said antibody is 17C7, 45-2, 13-1, 29-31, 16A7, 17B1 
or 5C12. 

47. The method of claim 46, wherein said antibody is 17C7. 

48. The method of claim 46, wherein said antibody is 45-2. 

49. The method of claim 46, wherein said antibody is 13-1. 

50. The method of claim 46, wherein said antibody is 29-31. 

51. The method of claim 46, wherein said antibody is 16A7. 

52. The method of claim 46, wherein said antibody is 5C12. 

53. The method of claim 46, wherein said antibody is 17B1. 

54. The method of claim 45, wherein said determining comprises an immunoassay selected 
from the group consisting of a western blot, an ELISA, an RIA and immunoaffinity 
separation. 



55. A method for screening a UspA1 or UspA2 peptide for the ability to induce a protective 
immune response against M. catarrhalis comprising the steps of: 

a) providing said peptide; 

b) administering a peptide in a suitable form to an experimental animal; 

c) challenging said animal with M. catarrhalis; and 

d) assaying the infection of said animal with M. catarrhalis. 

56. The method of claim 56, wherein said animal is a mouse, said challenging is a 
pulmonary challenge, and said assaying comprises assessing the degree of pulmonary 
clearance by said mouse. 



57. The method of claim 56, wherein said UspA1 peptide encompasses about residues 
582-604 (SEQ ID NO:1) of M. catarrhalis or the analogous position thereof when 
compared to M. catarrhalis strain 035E. 

58. The method of claim 56, wherein said UspA2 peptide encompasses about residues 
355-377 (SEQ ID NO:3) of M. catarrhalis or the analogous position thereof when 
compared to M. catarrhalis strain 035E. 

59. The method of claim 57, wherein said UspA1 peptide includes about residues 452-642 
(SEQ ID NO:1) of M. catarrhalis or the analogous positions thereof when compared to 
M. catarrhalis strain 035E. 

60. The method of claim 58, wherein said UspA2 peptide includes about residues 242-415 
(SEQ ID NO:3) of M. catarrhalis, or the analogous position thereof when compared to 
M. catarrhalis strain 035E. 

61. An isolated peptide having at least about 7 consecutive amino acids from the UspA1 or 
UspA2 protein of M. catarrhalis, wherein said peptide includes residues located within 
the region defined by about residues 582-604 of said UspA1 protein (SEQ ID NO:1), or 
by about residues 355-377 of said UspA2 protein (SEQ ID NO:3), or the analogous 
positions thereof when compared to strain 035E. 


62. The isolated peptide of claim 61, wherein said peptide is between 7 and 60 amino acids 
in length. 

63. The isolated peptide of claim 61, wherein said peptide comprises non-UspA1 or 
non-UspA2 sequences. 



64. The isolated peptide of claim 61, wherein said peptide comprises non-M. catarrhalis 
sequences. 

65. An antigenic composition comprising 

a) an isolated peptide having at least about 7 consecutive amino acids from the 
UspA1 or UspA2 protein of M. catarrhalis, wherein said amino acids include 
residues located within the region defined by about residues 582-604 of said 
UspA1 protein (SEQ ID NO:1), or by about residues 355-377 of said UspA2 
protein (SEQ ID NO:3), or the analogous positions thereof when compared to 
strain 035E; and 

b) a pharmaceutically acceptable buffer or diluent. 

66. An antigenic composition comprising 

a) an isolated peptide of about 7 to about 60 amino acids comprising at least 7 
consecutive residues of the amino acid sequence of UspA1 or UspA2 wherein 
said isolated peptide acts as a carrier covalently linked to a second antigen; and 

b) a pharmaceutically acceptable buffer or diluent. 

67. The antigenic composition of claim 66, wherein said second antigen is a peptide antigen. 

68. The antigenic composition of claim 66, wherein said second antigen is a non-peptide 
antigen. 

DESCRIPTION 

USPA1 AND USPA2 ANTIGENS OF MORAXELLA CATARRHALIS 

BACKGROUND OF THE INVENTION 

I. Field of the Invention 

The present invention relates generally to the fields of microbiology and clinical 
bacteriology. More particularly, it concerns sequences of the uspA1 and uspA2 genes which 
encode the proteins UspA1 and UspA2, respectively, both of which encode an epitope reactive 
with monoclonal antibody (MAb) 17C7 and provide useful epitopes for immunodiagnosis and 
immunoprophylaxis. 



II. Description of Related Art 

It was previously thought that Moraxella catarrhalis, previously known as Branhamella 
catarrhalis or Neisseria catarrhalis, was a harmless saprophyte of the upper respiratory tract 
(Catlin, 1990; Berk, 1990). However, during the previous decade, it has been determined that 
this organism is an important human pathogen. Indeed, it has been established that this 
Gram-negative diplococcus is the cause of a number of human infections (Murphy, 1989). M. 
catarrhalis is now known to be the third most common cause of both acute and chronic otitis 
media (Catlin, 1990; Faden et al., 1990; 1991; Marchant, 1990), the most common disease for 
which infants and children receive health care according to the 1989 Consensus Report. This 
organism also causes acute maxillary sinusitis, generalized infections of the lower respiratory 
tract (Murphy and Loeb, 1989) and is an important cause of bronchopulmonary infections in 
patients with underlying chronic lung disease and, less frequently, of systemic infections in 
immunocompromised patients (Melendez and Johnson, 1990; Sarubbi et al., 1990; Schonheyder 
and Ejlertsen, 1989; Wright and Wallace, 1989). 

The 1989 Consensus Report further concluded that prevention of otitis media is an 
important health care goal because of its occurrence both in infants and children and in certain 
populations of all age groups. In fact, the total financial burden of otitis media has been 
estimated to be at least $2.5 billion annually. Vaccines were identified as the most desired 
approach to prevent this disease for a number of reasons. For example, it was estimated that if 
vaccines could reduce the incidence of otitis media by 30%, then the annual health care savings 
would be at least $400 million. However, while some progress has been made in the 
development of vaccines for 2 of the 3 common otitis media pathogens, Streptococcus 
pneumoniae and Haemophilus influenzae, there is no indication that similar progress has been 
made with respect to M. catarrhalis. This is particularly troublesome in that M. catarrhalis 
now accounts for approximately 17-20% of all otitis media infections (Murphy, 1989). In 
addition, M. catarrhalis is also a significant cause of sinusitis (van Cauwenberge et al., 1993) 
and persistent cough (Gottfarb and Brauner, 1994) in children. In the elderly, it infects patients 
with predisposing conditions such as chronic obstructive pulmonary disease (COPD) and other 
chronic cardiopulmonary conditions (Boyle et al., 1991; Davies and Maesen, 1988; Hager et al., 
1987). 

Despite its recognized virulence potential, little is known about the mechanisms 
employed by M. catarrhalis in the production of disease or about host factors governing 
immunity to this pathogen. An antibody response to M. catarrhalis otitis media has been 
documented by means of an ELISA system using whole M. catarrhalis cells as antigen and 
acute and convalescent sera or middle ear fluid as the source of antibody (Leinonen et al., 
1981). The development of serum bactericidal antibody during M. catarrhalis infection in 
adults was shown to be dependent on the classical complement pathway (Chapman et al., 
1985). And more recently, it was reported that young children with M. catarrhalis otitis media 
develop an antibody response in the middle ear but fail to develop a systemic antibody response 
in a uniform manner (Faden et al., 1992). 

Previous attempts have been made to identify and characterize M. catarrhalis antigens 
that would serve as potentially important targets of the human immune response to infection 
(Murphy, 1989; Goldblatt et al., 1990; Murphy et al., 1990). Generally speaking, the surface of 
M. catarrhalis is composed of outer membrane proteins (OMPs), lipooligosaccharide (LOS) 
and fimbriae. M. catarrhalis appears to be somewhat distinct from other Gram-negative 
bacteria in that attempts to isolate the outer membrane of this organism using detergent 
fractionation of cell envelopes have generally proven unsuccessful, as the procedures 
did not yield consistent results (Murphy, 1989; Murphy and Loeb, 1989). Moreover, 
preparations were found to be contaminated with cytoplasmic membranes, suggesting an 
unusual characteristic of the M. catarrhalis cell envelope. 

Passive immunization with polyclonal antisera raised against outer membrane vesicles 
of the M. catarrhalis strain 035E was also found to protect against pulmonary challenge by the 
heterologous M. catarrhalis strain TTA24. In addition, active immunization with M. 
catarrhalis outer membrane vesicles resulted in enhanced clearance of this organism from the 
lungs after challenge. The positive effect of immunization in pulmonary clearance indicates 
that antibodies play a major role in immunoprotection from this pathogen. In addition, the 
protection observed against pulmonary challenge with a heterologous M. catarrhalis strain 
demonstrates that one or more conserved surface antigens are targets for antibodies which 
function to enhance clearance of M. catarrhalis from the lungs. 

Outer membrane proteins (OMPs) constitute major antigenic determinants of this 
unencapsulated organism (Bartos and Murphy, 1988) and different strains share remarkably 
similar OMP profiles (Bartos and Murphy, 1988; Murphy and Bartos, 1989). At least three 
different surface-exposed outer membrane antigens have been shown to be well-conserved among 
M. catarrhalis strains; these include the 81 kDa CopB OMP (Helminen et al., 1993b), the 
heat-modifiable CD OMP (Murphy et al., 1993) and the high-molecular weight UspA antigen 
(Helminen et al., 1994). Of these three antigens, both the CopB protein and UspA antigen have 
been shown to bind antibodies which exert biological activity against M. catarrhalis in an animal 
model (Helminen et al., 1994; Murphy et al., 1993). 

The MAb, designated 17C7, was described as binding to UspA, a very high molecular 
weight protein that migrated with an apparent molecular weight (in SDS-PAGE) of at least 250 
kDa (Helminen et al., 1994; Klingman and Murphy, 1994). MAb 17C7 enhanced pulmonary 
clearance of M. catarrhalis from the lungs of mice when used in passive immunization studies 
and, in colony blot radioimmunoassay analysis, bound to every isolate of M. catarrhalis 
examined. This same MAb also reacted, although less intensely, with another antigen band of 
approximately 100 kDa, as described in U.S. Patent No. 5,552,146 (incorporated herein by 
reference). A recombinant bacteriophage that contained a fragment of M. catarrhalis 
chromosomal DNA and expressed a protein product that bound MAb 17C7 was also identified; 
this product migrated at a rate similar to or indistinguishable from that of the native UspA 
antigen from M. catarrhalis (Helminen et al., 1994). 

With the rising importance of this pathogen in respiratory tract infections, identification 
of the surface components of this bacterium involved in virulence expression and immunity is 
becoming more important. To date, there are no vaccines available, against any other OMP, 
LOS or fimbriae, that induce protective antibodies against M. catarrhalis. Thus, it is clear that 
there remains a need to identify and characterize useful antigens which can be employed in 
the preparation of immunoprophylactic reagents. Additionally, once such an antigen or 
antigens is identified, there is a need for providing methods and compositions which will allow 
the preparation of vaccines in quantities that will allow their use on a wide scale basis in 
prophylactic protocols. 

SUMMARY OF THE INVENTION 

It is, therefore, an object of the present invention to provide new UspA1 and UspA2 
proteins and genes coding therefor. It also is an object of the present invention to provide 
methods of using these new proteins, for example, in the preparation of agents for the treatment 
and inhibition of M. catarrhalis infection. It also is contemplated that, through the use of other 
technologies such as antibody treatment and immunoprophylaxis, one can inhibit or even 
prevent M. catarrhalis infections. 

In satisfying these goals, there are provided epitopic core sequences of UspA1 and 
UspA2 which can serve as the basis for the preparation of therapeutic or prophylactic 
compositions or vaccines which comprise peptides of 7, 10, 20, 30, 40, 50 or even 60 amino 
acids in length that elicit an antigenic reaction and a pharmaceutically acceptable buffer or 
diluent. These peptides may be coupled to a carrier, adjuvant, another peptide or other 
molecule such that an effective antigenic response to M. catarrhalis is retained or even 
enhanced. Alternatively, these peptides may act as carriers themselves when coupled to another 
peptide or other molecule that elicits an antigenic response to M. catarrhalis or another 
pathogen. For example, UspA2 can serve as a carrier for an oligosaccharide. 

In one embodiment, the epitopic core sequences of UspA1 and UspA2 comprise one or 
more isolated peptides of 7, 10, 20, 30, 40, 50 or even 60 amino acids in length having the 
amino acid sequence AQQQDQH (SEQ ID NO:17). 

In another embodiment, there are provided nucleic acids, uspA1 and uspA2, which 
encode the UspA1 and the UspA2 antigens, respectively, as well as the amino acid sequences of 
the UspA1 and UspA2 antigens of the M. catarrhalis isolates 035E, TTA24, TTA37, and 
046E. It is envisioned that nucleic acid segments and fragments of the genes uspA1 and uspA2 
and the UspA1 and UspA2 antigens will be of value in the preparation and use of therapeutic or 
prophylactic compositions or vaccines for treating, inhibiting or even preventing M. catarrhalis 
infections. 

In another embodiment, there is provided a method for inducing an immune response in 
a mammal comprising the step of providing to the mammal an antigenic composition that 
comprises an isolated peptide of about 20 to about 60 amino acids that contains the identified 
epitopic core sequence and a pharmaceutically acceptable buffer or diluent. 

In another embodiment, there is provided a method for diagnosing M. catarrhalis 
infection which comprises the step of determining the presence, in a sample, of an M. 
catarrhalis amino acid sequence corresponding to residues of the epitopic core sequences of 
either the UspA1 or UspA2 antigen. This method may comprise PCR™ detection of the 
nucleotide sequences or, alternatively, immunologic reactivity of an antibody to either a 
UspA1 or UspA2 antigen. 

In a further embodiment, there is provided a method for treating an individual having an 
M. catarrhalis infection which comprises providing to the individual an isolated peptide of 
about 20 to about 60 amino acids that comprises at least about 10 consecutive residues of the 
amino acid sequence identified as an epitopic core sequence of UspA1 or UspA2. 

In a still further embodiment, there is provided a method for preventing or limiting an 
M. catarrhalis infection that comprises providing to a subject an antibody that reacts 
immunologically with the identified epitopic core region of either UspA1 or UspA2 of M. 
catarrhalis. 

In another embodiment, there is provided a method for screening a peptide for reactivity 
with an antibody that binds immunologically to UspA1, UspA2 or both which comprises the 
steps of providing the peptide and contacting the peptide with the antibody and then 
determining the binding of the antibody to the peptide. This method may comprise an 
immunoassay such as a western blot, an ELISA, an RIA or an immunoaffinity separation. 

In a still further embodiment, there is provided a method for screening a UspA1 or 
UspA2 peptide for its ability to induce a protective immune response against M. catarrhalis by 
providing the peptide, administering it in a suitable form to an experimental animal, challenging 
the animal with M. catarrhalis and then assaying for an M. catarrhalis infection in the animal. 
It is envisioned that the animal used will be a mouse that is challenged by a pulmonary 
exposure to M. catarrhalis and that the assaying comprises assessing the degree of pulmonary 
clearance by the mouse. 

Other objects, features and advantages of the present invention will become apparent 
from the following detailed description. It should be understood, however, that the detailed 
description and the specific examples, while indicating preferred embodiments of the invention, 
are given by way of illustration only, since various changes and modifications within the spirit 
and scope of the invention will become apparent to those skilled in the art from this detailed 
description. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The following drawings form part of the present specification and are included to further 
demonstrate certain aspects of the present invention. The invention may be better understood 
by reference to one or more of these drawings in combination with the detailed description of 
specific embodiments presented herein. 

FIG. 1. Southern blot analysis of PvuII-digested chromosomal DNA from strains of M. 
catarrhalis using a probe from the uspA1 gene. Bacterial strain designations are at the top; 
kilobase (kb) position markers are on the left. 

FIG. 2A. Proteins present in whole cell lysates of the wild-type strain 035E and the 
isogenic uspA1 mutant strain. These proteins were resolved by SDS-PAGE and stained with 
Coomassie blue. The left lane (WT) contains the wild-type strain and the right lane (MUT) 
contains the mutant. The arrows indicate the protein, approximately 120 kDa in size, that is 
present in the wild-type and missing in the mutant. Kilodalton position markers are on the left. 

FIG. 2B. Western blot analysis of whole cell lysates of the wild-type strain 035E and the 
isogenic uspA1 mutant strain. These proteins were resolved by SDS-PAGE and probed with MAb 
17C7 in western blot analysis. The left lane (WT) contains the wild-type strain and the right lane 
(MUT) contains the mutant. Kilodalton position markers are on the left. It can be seen that 
both strains possess the very high molecular weight band reactive with MAb 17C7 whereas only 
the wild-type strain also has a band of approximately 120 kDa that binds this MAb. 

FIG. 2C. Western blot analysis of whole cell lysate (WCL) and EDTA-extracted outer 
membrane vesicles (OMV) from the wild-type strain 035E (WT) and the isogenic uspA1 mutant 
(MUT) using MAb 17C7. Samples were heated either at 37°C for 15 minutes (H) or at 100°C for 
5 minutes (B) prior to SDS-PAGE. Molecular weight position markers (in kilodaltons) are 
indicated on the left. The open arrow indicates the position of the very high molecular weight 
form of the MAb 17C7-reactive antigen; the closed arrow indicates the position of the 
approximately 120 kDa protein; the open circle indicates the position of the approximately 70-80 
kDa protein. 

FIG. 3. Southern blot analysis of chromosomal DNA from the wild-type M. catarrhalis 
strain 035E and the isogenic uspA1 mutant. Chromosomal DNA was digested with PvuII and 
probed with a 0.6 kb BglII-PvuII fragment from the uspA1 gene. The wild-type strain is listed as 
035E at the top of this figure and the mutant strain is listed as 035E-uspA1. Kilobase position 
markers are present on the left side. 

FIG. 4. Western blot reactivity of proteins in M. catarrhalis strain 035E outer membrane 
vesicles (labeled 035E OMV) and the MF-4-1 GST fusion protein (labeled GST fusion protein) 
with MAb 17C7. 

FIG. 5. PCR™ products obtained by the use of the T3 and P10 primers (middle lane - 0.9 
kb product) and the T7 and P9 primers (right lane - 1.7 kb product) when used in a PCR™ 
amplification with chromosomal DNA from the uspA1 mutant. A kb ladder is present in the first 
lane; several kb position markers are listed on the left side of this figure. 

FIG. 6A-6C. SDS-PAGE and westerns of purified proteins. FIG. 6A. Coomassie blue 
stained gel of purified UspA2 (lane 2). FIG. 6B. Coomassie blue stained gel of purified UspA1 
prepared without heating of sample (lane 4), heated for 3 min at 100°C (lane 5), heated for 5 
min at 100°C (lane 6), and heated for 10 min at 100°C (lane 7). FIG. 6C. Western of the 
purified UspA2 (lane 9) and purified UspA1 (lane 10) probed with the 17C7 MAb. Both 
proteins were heated 10 min. The molecular size markers in lanes 1, 3, and 8 are as indicated in 
kilodaltons. 

FIG. 7. Interaction of purified UspA1 and UspA2 with HEp-2 cells as determined by 
ELISA. HEp-2 cell monolayers cultured in a 96-well plate were incubated with serially diluted 
UspA1 or UspA2. The 035E bacterial strain was used as the positive control. The bacteria were 
diluted analogously to the proteins, beginning with a suspension with an A550 of 1.0. The bound 
proteins or attached bacteria were detected with a 1:1 mixed antisera to UspA1 and UspA2 as 
described in the methods. 

FIG. 8. Interaction with fibronectin and vitronectin determined by dot blot. The bound 
vitronectin was detected with rabbit polyclonal antibodies; the protein bound to the fibronectin 
was detected with pooled sera made against UspA1 and UspA2. 

FIG. 9. The levels of antibodies to the proteins UspA1 and UspA2 and to the M. catarrhalis 
035E strain in normal human sera. Data are the log10 transformed end-point titers of the IgG 
(FIGs. 9A-9C) and IgA (FIGs. 9D-9F) antibodies determined by ELISA. The individual titers were 
plotted according to age group and the geometric mean titer for each age group linked by a solid 
line. Sera for the 2-18 month old children were consecutive samples from a group of ten 
children. 

FIG. 10. Subclass distribution of IgG antibodies to UspA1 and UspA2 in normal 
human sera. FIG. 10A shows titers toward UspA1 and FIG. 10B shows titers to UspA2. 

FIG. 11. Relationship of serum IgG titers to UspA1 (FIG. 11A) and UspA2 (FIG. 11B) 
with the bactericidal titer against the 035E strain determined by logistic regression (p<0.05). 
The solid line indicates the linear relationship between the IgG titer and bactericidal titer. 
Broken lines represent the 95% confidence intervals of the linear fit. 

FIG. 12. Schematic drawing showing the relative positions of decapeptides 10-24 
within the region of UspA1 and UspA2 which binds to MAb 17C7. 

FIG. 13. Western dot blot analysis demonstrating reactivity of decapeptides 10-24 with 
MAb 17C7. 

FIG. 14. Partial restriction enzyme map of the uspA1 (FIG. 14A) and uspA2 (FIG. 14B) 
genes from M. catarrhalis strain 035E and the mutated versions of these genes. The shaded 
boxes indicate the open reading frame of each gene. Relevant restriction sites are indicated. 
PCR™ primer sites (P1-P6) are indicated by arrows. The DNA fragments containing the partial 
uspA1 and uspA2 open reading frames that were derived from M. catarrhalis strain 035E 
chromosomal DNA by PCR™ and cloned into pBluescriptII SK+ are indicated by black bars. 
Dotted lines connect corresponding restriction sites on these DNA inserts and the chromosome. 
Open bars indicate the location of the kanamycin or chloramphenicol cassettes, respectively. 
The DNA probes specific for uspA1 or uspA2 are indicated by the appropriate cross-hatched 
bars and were amplified by PCR™ from M. catarrhalis strain 035E chromosomal DNA by the 
use of the oligonucleotide primer pairs 

P3 (5'-GACGCTCAACAGCACTAATACG-3') (SEQ ID NO:20)/P4 
(5'-CCAAGCTGATATCACTACC-3') (SEQ ID NO:21) and 
P5 (5'-TCAATGCCTTTGATGGTC-3') (SEQ ID NO:22)/P6 
(5'-TGTATGCCGCTACTCGCAGCT-3') (SEQ ID NO:23), respectively. 

FIG. 15. Detection of the UspA1 and UspA2 proteins in wild-type and mutant strains 
of M. catarrhalis 035E. Proteins present in EDTA-extracted outer membrane vesicles from the 
wild-type strain (lane 1), the uspA1 mutant strain 035E.1 (lane 2), the uspA2 mutant strain 
035E.2 (lane 3), and the isogenic uspA1 uspA2 double mutant strain 035E.12 (lane 4) were 
resolved by SDS-PAGE, and either stained with Coomassie blue (FIG. 15A) or transferred to 
nitrocellulose and probed with MAb 17C7 followed by radioiodinated goat anti-mouse 
immunoglobulin in western blot analysis (FIG. 15B). In FIG. 15A, the closed arrow indicates the 
very high molecular weight form of the UspA antigen which is comprised of both UspA1 and 
UspA2. In FIG. 15B, the bracket on the left indicates the very high molecular weight forms of the 
UspA1 and UspA2 proteins that bind MAb 17C7. The open arrow indicates the 120 kDa, putative 
monomeric form of UspA1. The closed arrow indicates the 85 kDa, putative monomeric form 
of UspA2. Molecular weight position markers (in kilodaltons) are present on the left. 

FIG. 16. Comparison of the rate and extent of growth of the wild-type and mutant 
strains of M. catarrhalis. The wild-type strain 035E (closed squares), the uspA1 mutant 
035E.1 (open squares), the uspA2 mutant 035E.2 (closed circles), and the uspA1 uspA2 double 
mutant 035E.12 (open circles) of M. catarrhalis 035E from overnight broth cultures were 
diluted to a density of 35 Klett units in BHI broth and subsequently allowed to grow at 37°C with 
shaking. Growth was followed by means of turbidity measurements. 

FIG. 17. Susceptibility of wild-type and mutant strains of M. catarrhalis to killing by 
normal human serum. Cells of the wild-type parent strain 035E (diamonds), uspA1 mutant 
035E.1 (triangles), uspA2 mutant 035E.2 (circles), and uspA1 uspA2 double mutant 035E.12 
(squares) from logarithmic-phase BHI broth cultures were incubated in the presence of 10% 
(v/v) normal human serum (closed symbols) or heat-inactivated normal human serum (open 
symbols). Data are presented as the percentage of the original inoculum remaining at each time 
point. 



The present invention relates to the identification of epitopes useful for developing 
potential vaccines against M. catarrhalis. Early work was directed at determining the 
molecular nature of the UspA antigen and characterizing the epitope which is recognized by 
MAb 17C7. Preliminary work indicated that MAb 17C7 recognizes a single antigenic epitope 
and it was believed that this epitope was encoded by a single gene. However, isolation of the 
protein which contained the epitope yielded unexpected results. MAb 17C7 recognized a single 
epitope, but the characteristics of the protein associated with the epitope suggested the existence 
of not one but two separate proteins. Further careful analyses led to a surprising discovery. A 
single epitope of the UspA antigen is recognized by MAb 17C7, but this epitope is present 
in two different proteins, UspA1 and UspA2, which are encoded by two different genes, uspA1 
and uspA2, respectively, and have only 43% identity to each other. The present invention 
provides the nucleotide sequences of the genes uspA1 and uspA2, their respective protein 
products, UspA1 and UspA2, and the shared epitope recognized by MAb 17C7. 

In addition, the present invention provides insights into the antigenic structure of the 
UspA protein based on the analysis of the sequences of the UspA1 and UspA2 proteins which 
comprise the protein. Characterization of the epitopic region of the molecule that is targeted by 
MAb 17C7 permits the development of agents that will be useful in protecting against M. 
catarrhalis infections, e.g., in the preparation of prophylactic reagents. Particular embodiments 
relate to the amino acid and nucleic acid sequences corresponding to the UspA1 and UspA2 
proteins, peptides and antigenic compositions derived therefrom, and methods for the diagnosis 
and treatment of M. catarrhalis disease. 

As stated previously, M. catarrhalis infections present a serious health challenge, 
especially to the young. Thus, there is a clear need to develop compositions and methods that 
will aid in the treatment and diagnosis of this disease. The present invention, by virtue of new 
information regarding the structure of the UspA antigen of M. catarrhalis, and discovery of the 
two new and distinct proteins UspA1 and UspA2, provides such improved compositions and 
methods. UspA1 and UspA2 represent important antigenic determinants, as MAb 17C7 has 
been shown to protect experimental animals, as measured in a pulmonary clearance model, 
when provided in passive immunizations. 

In a first embodiment, the present invention provides for the identification of the 
proteins UspA1 and UspA2 from M. catarrhalis strain 035E. The UspA1 protein comprises 
about 831 amino acid residues and has a predicted mass of about 88,271 daltons (SEQ ID 
NO:1). The UspA2 protein comprises about 576 residues and has a predicted mass of about 
62,483 daltons (SEQ ID NO:3). UspA2 is not a truncated or processed form of UspA1. 

In a second embodiment, the present invention has identified the specific epitope to 
which MAb 17C7 binds. A common peptide sequence, designated as the "3Q" peptide, found 
between amino acid residues 480-502 and 582-604 of the UspA1 protein (SEQ ID NO:1) and 
residues 355-377 of the UspA2 protein (SEQ ID NO:3) of M. catarrhalis strain 035E, 
encompasses the region which appears to be recognized by MAb 17C7. (Note that numbering 
of the amino acid residues is based upon strain 035E as provided in SEQ ID NO:3.) It is 
envisioned that this region plays an important role in the biology of the pathogen and, from this 
information, one will deduce amino acid residues that are critical in MAb 17C7 antibody 
binding. It also is envisioned that, based upon this information, one will be able to design 
epitopic regions that have either a higher or lower affinity for MAb 17C7 or other 
antibodies. Further embodiments of the present invention are discussed below. 

In another preferred embodiment, the present invention provides DNA segments, 
vectors and the like comprising at least one isolated gene, DNA segment or coding region that 
encodes an M. catarrhalis UspA1 or UspA2 protein, polypeptide, domain, peptide or any fusion 
protein thereof. Herein are provided at least an isolated gene, DNA segment or coding region 
that encodes an M. catarrhalis uspA1 gene comprising about 2493 base pairs (bp) (SEQ ID 
NO:2) of strain 035E, about 3381 bp (SEQ ID NO:6) of strain 046E, about 3538 bp (SEQ ID 
NO:10) of strain TTA24, or about 3292 bp (SEQ ID NO:14) of strain TTA37. Further provided 
are at least an isolated gene, DNA segment or coding region that encodes an M. catarrhalis 
uspA2 gene comprising about 1728 bp (SEQ ID NO:4) of strain 035E, about 3295 bp (SEQ ID 
NO:8) of strain 046E, about 2673 bp (SEQ ID NO:12) of strain TTA24, or about 4228 bp (SEQ ID 
NO:16) of strain TTA37. It is envisioned that the uspA1 and uspA2 genes will be useful in the 
preparation of proteins, antibodies, screening assays for potential candidate drugs and the like 
to treat or inhibit, or even prevent, M. catarrhalis infections. 
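
As a simple consistency check, the strain 035E figures quoted above agree with one another: about 831 residues and about 2493 bp for uspA1, and about 576 residues and about 1728 bp for uspA2, each correspond to three nucleotides per residue. The following is a minimal sketch of that arithmetic only. The dictionary layout and the function name are illustrative assumptions, and the exclusion of a stop codon is inferred from the numbers rather than stated in the text.

```python
# Residue counts and coding-region lengths quoted in the text for strain 035E.
REPORTED = {
    "UspA1": {"residues": 831, "base_pairs": 2493},
    "UspA2": {"residues": 576, "base_pairs": 1728},
}

NUCLEOTIDES_PER_CODON = 3


def coding_length(residues):
    """Base pairs needed to encode the given number of residues.

    Assumption: the stop codon is not counted.
    """
    return residues * NUCLEOTIDES_PER_CODON


for name, values in REPORTED.items():
    expected = coding_length(values["residues"])
    print(name, expected, expected == values["base_pairs"])
```

No such check is offered here for the other strains, since the text above does not give residue counts for them.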

The present invention also provides for the use of the UspA1 or UspA2 proteins or 
peptides as immunogenic carriers of other agents which are useful for the treatment, inhibition 
or even prevention of other bacterial, viral or parasitic infections. It is envisioned that either the 
UspA1 or UspA2 antigen, or portions thereof, will be coupled, bonded, bound, conjugated or 
chemically-linked to one or more agents via linkers, polylinkers or derivatized amino acids such 
that a bispecific or multivalent composition or vaccine which is useful for the treatment, 
inhibition or even prevention of infection by M. catarrhalis and another pathogen(s) is 
prepared. It is further envisioned that the methods used in the preparation of these compositions 
will be familiar to those of skill in the art and, for example, similar to those used to prepare 
conjugates to keyhole limpet hemocyanin (KLH) or bovine serum albumin (BSA). 

It is important to note that screening methods for diagnosis and prophylaxis are readily 
available, as set forth below. Thus, the ability to (i) test peptides, mutant peptides and 
antibodies for their reactivity with each other and (ii) test peptides and antibodies for the ability 
to prevent infections in vivo, provides powerful tools to develop clinically important reagents. 

1.0 UspA Proteins, Peptides and Polypeptides 

The present invention, in one embodiment, encompasses the two new protein sequences, 
UspA1 and UspA2, and the peptide sequence AQQQDQH (SEQ ID NO:17) identified as the 
target epitope of MAb 17C7. In addition, inspection of the amino acid sequences of the UspA1 
and UspA2 proteins from four strains of M. catarrhalis indicated that each protein contained at 
least one copy of the peptide YELAQQQDQH (SEQ ID NO:18) which binds MAb 17C7 or, in 
one instance, a peptide nearly identical and having the amino acid sequence YDLAQQQDQH 
(SEQ ID NO:19). 

The peptide (YELAQQQDQH, SEQ ID NO:18) occurs twice in UspA1 from strain 
035E at residues 486-495 and 588-597 (SEQ ID NO:1) and once in UspA2 from strain 035E at 
residues 358-367 (SEQ ID NO:3). It occurs once in UspA1 from strain TTA24 at residues 497- 
506 (SEQ ID NO:9) and twice in UspA2 from strain TTA24 at residues 225-234 and 413-422 
(SEQ ID NO:11). The peptide YDLAQQQDQH (SEQ ID NO:19) occurs once in UspA1 from 
strain 046E at residues 448-457 (SEQ ID NO:5) whereas the peptide YELAQQQDQH (SEQ 
ID NO:18) occurs once in this same protein at residues 649-658 (SEQ ID NO:5). The peptide 
YELAQQQDQH (SEQ ID NO:18) occurs once in UspA2 from strain 046E at residues 416-425 
(SEQ ID NO:7). The peptide YELAQQQDQH (SEQ ID NO:18) occurs twice in UspA1 from 
strain TTA37 at residues 478-487 and 630-639 (SEQ ID NO:13) and twice in UspA2 from 
strain TTA37 at residues 522-531 and 681-690 (SEQ ID NO:15). 
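
Residue ranges of the kind listed above can be located by a plain substring search that reports 1-based, inclusive positions. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the test sequence is an invented toy string, not any of the disclosed SEQ ID NO sequences.

```python
def find_epitope(sequence, motif):
    """Return (first, last) residue positions, 1-based and inclusive,
    of every occurrence of motif in sequence (overlaps allowed)."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append((start + 1, start + len(motif)))
        start = sequence.find(motif, start + 1)
    return hits


# Toy sequence for illustration only; it is NOT a UspA1 or UspA2 sequence.
toy = "MKT" + "YELAQQQDQH" + "GS" + "YDLAQQQDQH" + "K"

print(find_epitope(toy, "YELAQQQDQH"))  # the decapeptide of SEQ ID NO:18
print(find_epitope(toy, "YDLAQQQDQH"))  # the variant of SEQ ID NO:19
print(find_epitope(toy, "AQQQDQH"))     # the core of SEQ ID NO:17, present in both
```

Searching for the seven-residue core rather than the decapeptide finds both the YELAQQQDQH and the YDLAQQQDQH forms, since the two differ only at the second position.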

Also encompassed in the present invention are hybrid molecules containing portions 
from one UspA protein, for example the UspA1 protein, fused with portions of the other UspA 
protein, in this example the UspA2 protein, or fused with other proteins which are useful for 
identification, such as kanamycin-resistance, or other purposes in the screening of potential 
vaccines or further characterization of the UspA1 and UspA2 proteins. For example, one may 
fuse residues 1-350 of any UspA1 with residues 351-576 of any UspA2. Alternatively, a fusion 
could be generated with sequences from three, four or even five peptide regions represented in a 
single UspA antigen. Also encompassed are fragments of the disclosed UspA1 and UspA2 
molecules, as well as insertion, deletion or replacement mutants in which non-UspA sequences 
are introduced, UspA sequences are removed, or UspA sequences are replaced with non-UspA 
sequences, respectively. 

UspA1 and UspA2 proteins, according to the present invention, may be advantageously 
cleaved into fragments for use in further structural or functional analysis, or in the generation of 
reagents such as UspA-related polypeptides and UspA-specific antibodies. This can be 
accomplished by treating purified or unpurified UspA1 and/or UspA2 with a peptidase such as 
endoproteinase glu-C (Boehringer, Indianapolis, IN). Treatment with CNBr is another method by 
which UspA1 and/or UspA2 fragments may be produced from their natural respective proteins. 
Recombinant techniques also can be used to produce specific fragments of UspA1 or UspA2. 

More subtle modifications and changes may be made in the structure of the encoded 
UspA1 or UspA2 polypeptides of the present invention and still obtain a molecule that encodes 
a protein or peptide with characteristics of the natural UspA antigen. The following is a 
discussion based upon changing the amino acids of a protein to create an equivalent, or even an 
improved, second-generation molecule. The amino acid changes may be achieved by changing 
the codons of the DNA sequence, according to the following codon table: 

Amino acid names and abbreviations    Codons 

Alanine          Ala   A    GCA GCC GCG GCU 
Cysteine         Cys   C    UGC UGU 
Aspartic acid    Asp   D    GAC GAU 
Glutamic acid    Glu   E    GAA GAG 
Phenylalanine    Phe   F    UUC UUU 
Glycine          Gly   G    GGA GGC GGG GGU 
Histidine        His   H    CAC CAU 
Isoleucine       Ile   I    AUA AUC AUU 
Lysine           Lys   K    AAA AAG 
Leucine          Leu   L    UUA UUG CUA CUC CUG CUU 
Methionine       Met   M    AUG 
Asparagine       Asn   N    AAC AAU 
Proline          Pro   P    CCA CCC CCG CCU 
Glutamine        Gln   Q    CAA CAG 
Arginine         Arg   R    AGA AGG CGA CGC CGG CGU 
Serine           Ser   S    AGC AGU UCA UCC UCG UCU 
Threonine        Thr   T    ACA ACC ACG ACU 
Valine           Val   V    GUA GUC GUG GUU 
Tryptophan       Trp   W    UGG 
Tyrosine         Tyr   Y    UAC UAU 
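
To illustrate how the codon table may be applied, the following minimal sketch enumerates the RNA sequences that could encode the peptide AQQQDQH (SEQ ID NO:17). Only the standard codons for the four residue types in that peptide are included. The function names are hypothetical and are not taken from the specification.

```python
from itertools import product

# Standard RNA codons for the residue types found in AQQQDQH (SEQ ID NO:17).
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCU"],
    "Q": ["CAA", "CAG"],
    "D": ["GAC", "GAU"],
    "H": ["CAC", "CAU"],
}

# Reverse lookup from codon to one-letter residue code.
CODON_TO_RESIDUE = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def count_encodings(peptide):
    """Number of distinct RNA sequences that encode the peptide."""
    total = 1
    for residue in peptide:
        total *= len(CODONS[residue])
    return total


def back_translate(peptide):
    """Yield every RNA sequence that encodes the peptide."""
    for combination in product(*(CODONS[residue] for residue in peptide)):
        yield "".join(combination)


def translate(rna):
    """Translate an RNA string, read in frame from its first base."""
    return "".join(CODON_TO_RESIDUE[rna[i:i + 3]] for i in range(0, len(rna), 3))


print(count_encodings("AQQQDQH"))
print(next(back_translate("AQQQDQH")))
```

Because alanine has four codons and each of the other six positions has two, many different nucleotide sequences encode the same seven-residue epitope, which is the degeneracy on which codon-level changes rely.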







It is known that certain amino acids may be substituted for other amino acids in a 
protein structure in order to modify or improve its antigenic or immunogenic activity (see, e.g., 
Kyte & Doolittle, 1982; Hopp, U.S. Patent 4,554,101, incorporated herein by reference). For 
example, through the substitution of alternative amino acids, small conformational changes may 
be conferred upon a polypeptide which result in increased activity or stability. Alternatively, 
amino acid substitutions in certain polypeptides may be utilized to provide residues which may 
then be linked to other molecules to provide peptide-molecule conjugates which retain enough 
antigenicity of the starting peptide to be useful for other purposes. For example, a selected 
UspA1 or UspA2 peptide bound to a solid support might be constructed which would have 
particular advantages in diagnostic embodiments. 

The importance of the hydropathic index of amino acids in conferring interactive 
biological function on a protein has been discussed generally by Kyte & Doolittle (1982), 
wherein it is found that certain amino acids may be substituted for other amino acids having a 
similar hydropathic index or score and still retain a similar biological activity. As displayed in 
Table II below, amino acids are assigned a hydropathic index on the basis of their 
hydrophobicity and charge characteristics. It is believed that the relative hydropathic character 
of the amino acid determines the secondary structure of the resultant protein, which in turn 
defines the interaction of the protein with substrate molecules. Preferred substitutions which 
result in an antigenically equivalent peptide or protein will generally involve amino acids 
having index scores within ±2 units of one another, and more preferably within ±1 unit, and 
even more preferably, within ±0.5 units. 

TABLE II 

Amino Acid          Hydropathic Index 

Isoleucine           4.5 
Valine               4.2 
Leucine              3.8 
Phenylalanine        2.8 
Cysteine/cystine     2.5 
Methionine           1.9 
Alanine              1.8 
Glycine             -0.4 
Threonine           -0.7 
Tryptophan          -0.9 
Serine              -0.8 
Tyrosine            -1.3 
Proline             -1.6 
Histidine           -3.2 
Glutamic Acid       -3.5 
Glutamine           -3.5 
Aspartic Acid       -3.5 
Asparagine          -3.5 
Lysine              -3.9 
Arginine            -4.5 



Thus, for example, isoleucine, which has a hydropathic index of +4.5, will preferably be 
exchanged with an amino acid such as valine (+4.2) or leucine (+3.8). Alternatively, at the 
other end of the scale, lysine (-3.9) will preferably be substituted for arginine (-4.5), and so on. 
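
The selection rule described above can be sketched as a lookup against the Table II values. The following is an illustrative sketch only: the one-letter keys, the function name and the default tolerance are assumptions made for the example, and the values are those listed in Table II.

```python
# Hydropathic index values as listed in Table II (one-letter residue codes).
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "W": -0.9, "S": -0.8, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def substitutes(residue, tolerance=2.0):
    """Residues whose hydropathic index lies within +/- tolerance of the given one."""
    reference = HYDROPATHY[residue]
    return sorted(
        other
        for other, value in HYDROPATHY.items()
        if other != residue and abs(value - reference) <= tolerance
    )


print(substitutes("I", tolerance=0.5))  # the most stringent window named in the text
print(substitutes("I", tolerance=1.0))
print(substitutes("K", tolerance=1.0))
```

Tightening the tolerance from ±2 to ±1 to ±0.5 shortens the candidate list, mirroring the order of preference stated above. A score window alone does not establish antigenic equivalence, which remains to be confirmed experimentally.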

Substitution of like amino acids may also be made on the basis of hydrophilicity, 
particularly where the biological functional equivalent protein or peptide thereby created is 
intended for use in immunological embodiments. U.S. Patent 4,554,101, incorporated herein by 
reference, states that the greatest local average hydrophilicity of a protein, as governed by the 
hydrophilicity of its adjacent amino acids, correlates with its immunogenicity and antigenicity, 
i.e., with an important biological property of the protein. 

As detailed in U.S. Patent 4,554,101, each amino acid has also been assigned a 
hydrophilicity value. These values are detailed below in Table III. 

TABLE III 

Amino Acid          Hydrophilic Index 

arginine            +3.0 
lysine              +3.0 
aspartate           +3.0 ± 1 
glutamate           +3.0 ± 1 
serine              +0.3 
asparagine          +0.2 
glutamine           +0.2 
glycine              0 
threonine           -0.4 
alanine             -0.5 
histidine           -0.5 
proline             -0.5 ± 1 
cysteine            -1.0 
methionine          -1.3 
valine              -1.5 
leucine             -1.8 
isoleucine          -1.8 
tyrosine            -2.3 
phenylalanine       -2.5 
tryptophan          -3.4 



It is understood that one amino acid can be substituted for another having a similar 
hydrophilicity value and still obtain a biologically equivalent, and in particular, an 
immunologically equivalent protein. In such changes, the substitution of amino acids whose 
hydrophilicity values are within ±2 is preferred, those which are within ±1 are particularly 
preferred, and those within ±0.5 are even more particularly preferred. 
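
The notion of greatest local average hydrophilicity referred to above can be sketched as a sliding-window average over the Table III values. The following is a sketch under stated assumptions: the window of six residues, the function names and the toy sequence are illustrative choices not taken from this specification, and the central values of Table III are used without the ±1 ranges.

```python
# Hydrophilicity values as listed in Table III (central values only).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "P": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def local_averages(sequence, window=6):
    """Average hydrophilicity of each run of `window` consecutive residues."""
    scores = [HYDROPHILICITY[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=6):
    """Return (1-based start, subsequence, average) of the highest-scoring window."""
    averages = local_averages(sequence, window)
    best = max(range(len(averages)), key=averages.__getitem__)
    return best + 1, sequence[best:best + window], averages[best]


# Toy sequence for illustration only; it is NOT a UspA1 or UspA2 sequence.
print(most_hydrophilic_window("LLLLLLKKDDEELLL"))
```

When several windows tie for the highest average, the sketch returns the first one encountered.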

Accordingly, these amino acid substitutions are generally based on the relative similarity 
of R-group substituents, for example, in terms of size, electrophilic character, charge, and the 
like. In general, preferred substitutions which take various of the foregoing characteristics into 
consideration will be known to those of skill in the art and include, for example, the following 
combinations: arginine and lysine; glutamate and aspartate; serine and threonine; glutamine 
and asparagine; and valine, leucine and isoleucine. 

In addition, peptides derived from these polypeptides, including peptides of at least 
about 6 consecutive amino acids from these sequences, are contemplated. Alternatively, such 
peptides may comprise about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 
25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 
50, 51, 52, 53, 54, 55, 56, 57, 58, 59 or 60 consecutive residues. For example, a peptide that 
comprises 6 consecutive amino acid residues may comprise residues 1 to 6, 2 to 7, 3 to 8 and so 
on of the UspA1 or UspA2 protein. Such peptides may be represented by the formula 

x to (x + n) = 5' to 3' the positions of the first and last consecutive residues 

where x is equal to any number from 1 to the full length of a UspA1 or UspA2 protein and n is 
equal to the length of the peptide minus 1. So, for UspA1, x = 1 to 831, for UspA2, x = 1 to 
576. Where the peptide is 10 residues long (n = 10-1), the formula represents every 10-mer 
possible for each antigen. For example, where x is equal to 1 the peptide would comprise 
residues 1 to (1 + [10-1]), or 1 to 10. Where x is equal to 2, the peptide would comprise 
residues 2 to (2 + [10-1]), or 2 to 11, and so on. 
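
The formula above can be sketched as a short enumeration of residue ranges. The following is a minimal sketch only, with a hypothetical function name. It makes one assumption the text leaves implicit: x is capped so that x + n does not run past the last residue of the protein, so the largest usable x is the protein length minus n.

```python
def peptide_windows(protein_length, peptide_length):
    """Return (first, last) residue positions, 1-based and inclusive, of every
    peptide of the given length, following the formula x to (x + n)."""
    n = peptide_length - 1            # n is the peptide length minus 1
    last_start = protein_length - n   # largest x that keeps x + n inside the protein
    return [(x, x + n) for x in range(1, last_start + 1)]


# Lengths quoted in the text for strain 035E: 831 residues (UspA1), 576 (UspA2).
uspa1_10mers = peptide_windows(831, 10)
uspa2_10mers = peptide_windows(576, 10)

print(uspa1_10mers[:2], uspa1_10mers[-1], len(uspa1_10mers))
print(uspa2_10mers[:2], uspa2_10mers[-1], len(uspa2_10mers))
```

Under that assumption, a protein of length L yields L - n peptides of length n + 1.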

Syntheses of peptides are readily achieved using conventional synthetic techniques such 
as the solid phase method (e.g., through the use of a commercially available peptide synthesizer 
such as an Applied Biosystems Model 430A Peptide Synthesizer). Peptides synthesized in this 
manner may then be aliquoted in predetermined amounts and stored in conventional manners, 
such as in aqueous solutions or, even more preferably, in a powder or lyophilized state pending 
use. 

In general, due to the relative stability of peptides, they may be readily stored in aqueous 
solutions for fairly long periods of time if desired, e.g., up to six months or more, in virtually 
any aqueous solution without appreciable degradation or loss of antigenic activity. However, 
where extended aqueous storage is contemplated it will generally be desirable to include agents 
including buffers such as Tris or phosphate buffers to maintain a pH of 7.0 to 7.5. Moreover, it 
may be desirable to include agents which will inhibit microbial growth, such as sodium azide or 
Merthiolate. For extended storage in an aqueous state it will be desirable to store the solutions 
at 4°C, or more preferably, frozen. Of course, where the peptide(s) are stored in a lyophilized 
or powdered state, they may be stored virtually indefinitely, e.g., in metered aliquots that may 
be rehydrated with a predetermined amount of water (preferably distilled, deionized) or buffer 
prior to use. 

Of particular interest are peptides that represent epitopes that lie within the UspA 
antigen and are encompassed by the UspA1 and UspA2 proteins of the present invention. An
"epitope" is a region of a molecule that stimulates a response from a T-cell or B-cell, and hence, 
elicits an immune response from these cells. An epitopic core sequence, as used herein, is a 
relatively short stretch of amino acids that is structurally "complementary" to, and therefore will 
bind to, binding sites on antibodies or T-cell receptors. It will be understood that, in the context 
of the present disclosure, the term "complementary" refers to amino acids or peptides that 
exhibit an attractive force towards each other. Thus, certain epitopic core sequences of the 
present invention may be operationally defined in terms of their ability to compete with or 
perhaps displace the binding of the corresponding UspA antigen to the corresponding UspA- 
directed antisera. 

The identification of epitopic core sequences is known to those of skill in the art. For
example, U.S. Patent 4,554,101 teaches identification and preparation of epitopes from amino
acid sequences on the basis of hydrophilicity, and by Chou-Fasman analyses. Numerous
computer programs are available for use in predicting antigenic portions of proteins, examples
of which include those programs based upon Jameson-Wolf analyses (Jameson and Wolf, 1988;
Wolf et al., 1988), the program PepPlot® (Brutlag et al., 1990; Weinberger et al., 1985), and
other new programs for protein tertiary structure prediction (Fetrow & Bryant, 1993) that can be
used in conjunction with computerized peptide sequence analysis programs.
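As an illustration of a hydrophilicity-based scan, the sketch below averages per-residue hydrophilicity over a sliding six-residue window and reports the highest-scoring stretch. It is not the procedure of any program named above. The scale values are those commonly tabulated for the Hopp-Woods scale and should be checked against a primary source before use; the function names and the window size of 6 are assumptions of this example.

```python
# Hydrophilicity values as commonly tabulated for the Hopp-Woods scale
# (assumption: verify against a primary source before relying on them).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(sequence: str, window: int = 6) -> list:
    """Mean hydrophilicity of every window of `window` consecutive residues."""
    scores = [HOPP_WOODS[aa] for aa in sequence]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def most_hydrophilic_window(sequence: str, window: int = 6):
    """Return (1-based start, peptide, mean score) of the top-scoring window."""
    profile = hydrophilicity_profile(sequence, window)
    if not profile:
        raise ValueError("sequence is shorter than the window")
    start = max(range(len(profile)), key=profile.__getitem__)
    return start + 1, sequence[start:start + window], profile[start]
```

A high-scoring window is only a candidate epitopic core sequence; it would still need to be tested for antibody binding as described in the text.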

In general, the size of the polypeptide antigen is not believed to be particularly crucial, 
so long as it is at least large enough to carry the identified core sequence or sequences. The 
smallest useful core sequence anticipated by the present disclosure would be on the order of 
about 6 amino acids in length. Thus, this size will generally correspond to the smallest peptide 
antigens prepared in accordance with the invention. However, the size of the antigen may be
larger where desired, so long as it contains a basic epitopic core sequence.

2.0 UspA1 and UspA2 Nucleic Acids

In addition to polypeptides, the present invention also encompasses nucleic acids
encoding the UspA1 (SEQ ID NO:2, SEQ ID NO:6, SEQ ID NO:10 and SEQ ID NO:14) and
UspA2 (SEQ ID NO:4, SEQ ID NO:8, SEQ ID NO:12 and SEQ ID NO:16) proteins from the
exemplary M. catarrhalis strains 035E, 046E, TTA24 and TTA37, respectively. Because of
the degeneracy of the genetic code, many other nucleic acids also may encode a given UspA1 or
UspA2 protein. For example, four different three-base codons encode each of the amino acids
alanine, glycine, proline, threonine and valine, while six different codons encode each of
arginine, leucine and serine. Only methionine and tryptophan are encoded by a single codon.
Table I provides a list of amino acids and their corresponding codons for use in such
embodiments. In order to generate any nucleic acid encoding UspA1 or UspA2, one need only
refer to the codon table provided herein. Substitution of the natural codon with any codon
encoding the same amino acid will result in a distinct nucleic acid that encodes UspA1 or
UspA2. As a practical matter, this can be accomplished by site-directed mutagenesis of an
existing uspA1 or uspA2 gene or de novo chemical synthesis of one or more nucleic acids.
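The degeneracy argument can be made concrete with a short sketch that builds the standard genetic code and counts, or lists, the distinct coding sequences for a given peptide. The compact 64-letter string below is the standard code in T, C, A, G order and stands in for the Table I referred to above, which is not reproduced in this section; the names `CODONS_FOR`, `count_encodings` and `all_encodings` are hypothetical.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
# (first base varies slowest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each one-letter amino acid to the list of DNA codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))


def count_encodings(protein: str) -> int:
    """Number of distinct DNA sequences that encode `protein`."""
    return prod(len(CODONS_FOR[aa]) for aa in protein)


def all_encodings(protein: str):
    """Yield every DNA sequence encoding `protein` (practical for short peptides only)."""
    for combo in product(*(CODONS_FOR[aa] for aa in protein)):
        yield "".join(combo)
```

The count grows multiplicatively with length, so `all_encodings` is only useful for short peptides; for a full-length protein one would count rather than enumerate.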

These observations regarding codon selection, site-directed mutagenesis and chemical
synthesis apply with equal force to the discussion of substitutional mutant UspA1 or UspA2
peptides and polypeptides, as set forth above. More specifically, substitutional mutants generated
by site-directed changes in the nucleic acid sequence that are designed to alter one or more codons
of a given polypeptide or epitope may provide a more convenient way of generating large
numbers of mutants in a rapid fashion. The nucleic acids of the present invention provide for a
simple way to generate fragments (e.g., truncations) of UspA1 or UspA2, UspA1-UspA2 fusion
molecules (discussed above) and UspA1 or UspA2 fusions with other molecules. For example,
utilization of restriction enzymes and nuclease in the uspA1 or uspA2 gene permits one to
manipulate the structure of these genes, and the resulting gene products.

The nucleic acid sequence information provided by the present disclosure also allows
for the preparation of relatively short DNA (or RNA) sequences that have the ability to
specifically hybridize to gene sequences of the selected uspA1 or uspA2 gene. In these aspects
nucleic acid probes of an appropriate length are prepared based on a consideration of the coding
sequence of the uspA1 or uspA2 gene, or flanking regions near the uspA1 or uspA2 gene, such as
regions downstream and upstream in the M. catarrhalis chromosome. The ability of such
nucleic acid probes to specifically hybridize to either uspA1 or uspA2 gene sequences lends
them particular utility in a variety of embodiments. For example, the probes can be used in a
variety of diagnostic assays for detecting the presence of pathogenic organisms in a given
sample. In addition, these oligonucleotides can be inserted, in frame, into expression constructs
for the purpose of screening the corresponding peptides for reactivity with existing antibodies
or for the ability to generate diagnostic or therapeutic reagents.

To provide certain of the advantages in accordance with the invention, the preferred
nucleic acid sequence employed for hybridization studies or assays includes sequences that are
complementary to at least a 10 to 20, or so, nucleotide stretch of the sequence, although
sequences of 30 to 60 or so nucleotides are also envisioned to be useful. A size of at least 9
nucleotides in length helps to ensure that the fragment will be of sufficient length to form a
duplex molecule that is both stable and selective. Molecules having complementary
sequences over stretches greater than 10 bases in length are generally preferred, though, in order
to increase stability and selectivity of the hybrid, and thereby improve the quality and degree of
the specific hybrid molecules obtained. Thus, one will generally prefer to design nucleic acid
molecules having either uspA1 or uspA2 gene-complementary stretches of 15 to 20 nucleotides,
or even longer, such as 30 to 60, where desired. Such fragments may be readily prepared by,
for example, directly synthesizing the fragment by chemical means, by application of nucleic
acid reproduction technology, such as the PCR™ technology of U.S. Patent 4,603,102, or by
introducing selected sequences into recombinant vectors for recombinant production.

The probes that would be useful may be derived from any portion of the sequences of SEQ
ID NO:2 or SEQ ID NO:4 or SEQ ID NO:6 or SEQ ID NO:8 or SEQ ID NO:10 or SEQ ID
NO:12 or SEQ ID NO:14 or SEQ ID NO:16. Therefore, probes are specifically contemplated that
comprise nucleotides 1 to 9, or 2 to 10, or 3 to 11 and so forth up to a probe comprising the last 9
nucleotides of the nucleotide sequence of SEQ ID NO:2 or SEQ ID NO:4 or SEQ ID NO:6 or
SEQ ID NO:8 or SEQ ID NO:10 or SEQ ID NO:12 or SEQ ID NO:14 or SEQ ID NO:16. Thus,
each probe would comprise at least about 9 linear nucleotides of the nucleotide sequence of SEQ
ID NO:2 or SEQ ID NO:4 or SEQ ID NO:6 or SEQ ID NO:8 or SEQ ID NO:10 or SEQ ID
NO:12 or SEQ ID NO:14 or SEQ ID NO:16, designated by the formula "n to n + 8," where n is
an integer from 1 to the number of nucleotides in the sequence. Longer probes that hybridize to
the uspA1 or uspA2 gene under low, medium, medium-high and high stringency conditions are
also contemplated, including those that comprise the entire nucleotide sequence of SEQ ID NO:2
or SEQ ID NO:4 or SEQ ID NO:6 or SEQ ID NO:8 or SEQ ID NO:10 or SEQ ID NO:12 or SEQ
ID NO:14 or SEQ ID NO:16. This hypothetical may be repeated for probes having lengths of
about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 and
greater bases.
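The "n to n + 8" enumeration can be sketched as a sliding window over a nucleotide sequence. The short sequence used here is an invented toy string, not any SEQ ID NO of the disclosure, and the function names are hypothetical. Note that in practice the last valid start position is the sequence length minus 8, which yields the probe comprising the last 9 nucleotides.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_windows(sequence: str, length: int = 9):
    """Yield (n, n + length - 1, probe) for every contiguous probe, 1-based."""
    for n in range(1, len(sequence) - length + 2):
        yield n, n + length - 1, sequence[n - 1:n - 1 + length]


TOY_SEQUENCE = "ATGGCCAAGTTC"  # invented 12-nt example, not a real uspA sequence

probes = list(probe_windows(TOY_SEQUENCE))
antisense_probes = [reverse_complement(p) for _, _, p in probes]
```

The same call with `length=15` or `length=30` gives the longer probes mentioned in the text; the reverse complement is what would hybridize to the strand shown.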

In that the UspA antigenic epitopes of the present invention are believed to be indicative
of pathogenic Moraxella species as exemplified by strains 035E, 046E, TTA24 and TTA37,
the probes of the present invention will find particular utility as the basis for diagnostic
hybridization assays for detecting UspA1 or UspA2 DNA in clinical samples. Exemplary
clinical samples that can be used in the diagnosis of infections are thus any samples which
could possibly include Moraxella nucleic acid, including middle ear fluid, sputum, mucus,
bronchoalveolar fluid, amniotic fluid or the like. A variety of hybridization techniques and
systems are known which can be used in connection with the hybridization aspects of the
invention, including diagnostic assays such as those described in Falkow et al., U.S. Patent
4,358,535. Depending on the application envisioned, one will desire to employ varying
conditions of hybridization to achieve varying degrees of selectivity of the probe toward the
target sequence. For applications requiring a high degree of selectivity, one will typically desire
to employ relatively stringent conditions to form the hybrids; for example, one will select
relatively low salt and/or high temperature conditions, such as provided by 0.02M-0.15M NaCl
at temperatures of 50°C to 70°C. These conditions are particularly selective, and tolerate little,
if any, mismatch between the probe and the template or target strand.

Of course, for some applications, for example, where one desires to prepare mutants
employing a mutant primer strand hybridized to an underlying template, less stringent
hybridization conditions are called for in order to allow formation of the heteroduplex. In these
circumstances, one would desire to employ conditions such as 0.15M-0.9M salt, at temperatures
ranging from 20°C to 55°C. In any case, it is generally appreciated that conditions can be
rendered more stringent by the addition of increasing amounts of formamide, which serves to
destabilize the hybrid duplex in the same manner as increased temperature. Thus, hybridization
conditions can be readily manipulated, and the method of choice will generally depend on the
desired results.
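The qualitative effect of salt, temperature and formamide on stringency can be illustrated with a melting-temperature estimate. The formula below is a widely quoted empirical approximation that is not taken from this disclosure, and its coefficients differ somewhat between references, so the numbers are indicative only. The function name and the 0.65 °C per percent formamide correction are assumptions of this sketch.

```python
from math import log10


def estimate_tm(probe: str, sodium_molar: float, formamide_percent: float = 0.0) -> float:
    """Rough duplex melting temperature in degrees C.

    Uses the empirical approximation
        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 675/N - 0.65*(%formamide)
    which is an assumption of this sketch, not a formula from the text.
    """
    probe = probe.upper()
    length = len(probe)
    gc_percent = 100.0 * (probe.count("G") + probe.count("C")) / length
    tm = 81.5 + 16.6 * log10(sodium_molar) + 0.41 * gc_percent - 675.0 / length
    return tm - 0.65 * formamide_percent


probe = "ATGC" * 5  # invented 20-nt probe with 50% G+C

tm_low_salt = estimate_tm(probe, sodium_molar=0.02)
tm_high_salt = estimate_tm(probe, sodium_molar=0.9)
tm_with_formamide = estimate_tm(probe, sodium_molar=0.9, formamide_percent=20.0)
```

Hybridizing closer to the estimated melting temperature is more stringent. Lowering the salt or adding formamide lowers the estimate, which is why either change makes a fixed hybridization temperature more selective, consistent with the ranges given above.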



SUBSTITUTE SHEET (RULE 26)



WO 98/28333 




PCT/US97/23930 



In certain embodiments, one may desire to employ nucleic acid probes to isolate variants 
from clone banks containing mutated clones. In particular embodiments, mutant clone colonies 
growing on solid media which contain variants of the UspA1 and/or UspA2 sequence could be
identified on duplicate filters using hybridization conditions and methods, such as those used in
colony blot assays, to obtain hybridization only between probes containing sequence variants
and nucleic acid sequence variants contained in specific colonies. In this manner, small
hybridization probes containing short variant sequences of either the uspA1 or uspA2 gene may
be utilized to identify those clones growing on solid media which contain sequence variants of
the entire uspA1 or uspA2 gene. These clones can then be grown to obtain desired quantities of
the variant UspA1 or UspA2 nucleic acid sequences or the corresponding UspA antigen.

In clinical diagnostic embodiments, nucleic acid sequences of the present invention are 
used in combination with an appropriate means, such as a label, for determining hybridization. 
A wide variety of appropriate indicator means are known in the art, including radioactive, 
enzymatic or other ligands, such as avidin/biotin, which are capable of giving a detectable 
signal. In preferred diagnostic embodiments, one will likely desire to employ an enzyme tag 
such as urease, alkaline phosphatase or peroxidase, instead of radioactive or other 
environmentally undesirable reagents. In the case of enzyme tags, colorimetric indicator
substrates are known which can be employed to provide a means visible to the human eye or 
spectrophotometrically, to identify specific hybridization with pathogen nucleic acid-containing 
samples. 

In general, it is envisioned that the hybridization probes described herein will be useful 
both as reagents in solution hybridizations as well as in embodiments employing a solid phase. 
In embodiments involving a solid phase, the test DNA (or RNA) from suspected clinical 
samples, such as exudates, body fluids (e.g., amniotic fluid, middle ear effusion,
bronchoalveolar lavage fluid) or even tissues, is adsorbed or otherwise affixed to a selected
matrix or surface. This fixed, single-stranded nucleic acid is then subjected to specific
hybridization with selected probes under desired conditions. The selected conditions will
depend on the particular circumstances based on the particular criteria required (depending, for
example, on the G+C contents, type of target nucleic acid, source of nucleic acid, size of
hybridization probe, etc.). Following washing of the hybridized surface so as to remove
nonspecifically bound probe molecules, specific hybridization is detected, or even quantified,
by means of the label.

The nucleic acid sequences which encode the UspA1 and/or UspA2 epitopes, or their
variants, may be useful in conjunction with PCR™ methodology to detect M. catarrhalis. In
general, by applying the PCR™ technology as set out, e.g., in U.S. Patent 4,603,102, one may
utilize various portions of either the uspA1 or uspA2 sequence as oligonucleotide probes for the
PCR™ amplification of a defined portion of a uspA1 or uspA2 nucleic acid in a sample. The
amplified portion of the uspA1 or uspA2 sequence may then be detected by hybridization with a
hybridization probe containing a complementary sequence. In this manner, extremely small
concentrations of M. catarrhalis nucleic acid may be detected in a sample utilizing uspA1 or uspA2
sequences.
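The idea of amplifying "a defined portion" between two oligonucleotides can be sketched as a simple in-silico search. This is an illustration only: it requires exact matches, ignores mismatches and melting behaviour, and uses an invented toy template rather than any uspA1 or uspA2 sequence. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon(template: str, forward_primer: str, reverse_primer: str):
    """Return the stretch of `template` bounded by the two primers, or None.

    The forward primer must match the given strand exactly; the reverse
    primer must match the opposite strand downstream of the forward primer.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    # On the given strand the reverse primer appears as its reverse complement.
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(site)]


TOY_TEMPLATE = "TTATGCCAGTACGTACGGATATT"  # invented example sequence
product_seq = amplicon(TOY_TEMPLATE, "ATGCCA", "TATCCG")
```

A hybridization probe for detecting the amplified portion would then be chosen from within `product_seq`, as the text describes.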

3.0 Vectors, Host Cells and Cultures for Producing UspA1 and/or UspA2 Antigens

In order to express a UspA1 and/or UspA2 polypeptide, it is necessary to provide a
uspA1 and/or uspA2 gene in an expression cassette. The expression cassette contains a UspA1
and/or UspA2-encoding nucleic acid under transcriptional control of a promoter. A "promoter"
refers to a DNA sequence recognized by the synthetic machinery of the cell, or introduced
synthetic machinery, required to initiate the specific transcription of a gene. The phrase "under
transcriptional control" means that the promoter is in the correct location and orientation in
relation to the nucleic acid to control RNA polymerase initiation and expression of the gene.
Those promoters most commonly used in prokaryotic recombinant DNA construction include
the β-lactamase (penicillinase) and lactose promoter systems (Chang et al., 1978; Itakura et al.,
1977; Goeddel et al., 1979) and a tryptophan (trp) promoter system (Goeddel et al., 1980; EPO
Appl. Publ. No. 0036776). While these are the most commonly used, other microbial
promoters have been discovered and utilized, and details concerning their nucleotide sequences
have been published, enabling a skilled worker to ligate them functionally with plasmid vectors
(EPO Appl. Publ. No. 0036776). Additional examples of useful promoters are provided in
Table IV below.
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TABLE IV 



Promoters                         References

Immunoglobulin Heavy Chain        Banerji et al., 1983; Gilles et al., 1983;
                                  Grosschedl and Baltimore, 1985; Atchinson
                                  and Perry, 1986, 1987; Imler et al., 1987;
                                  Weinberger et al., 1988; Kiledjian et al., 1988;
                                  Porton et al., 1990
Immunoglobulin Light Chain        Queen and Baltimore, 1983; Picard and
                                  Schaffner, 1984
T-Cell Receptor                   Luria et al., 1987; Winoto and Baltimore, 1989;
                                  Redondo et al., 1990
HLA DQ α and DQ β                 Sullivan and Peterlin, 1987
β-Interferon                      Goodbourn et al., 1986; Fujita et al., 1987;
                                  Goodbourn and Maniatis, 1985
Interleukin-2                     Greene et al., 1989
Interleukin-2 Receptor            Greene et al., 1989; Lin et al., 1990
MHC Class II 5                    Koch et al., 1989
MHC Class II HLA-DRα              Sherman et al., 1989
β-Actin                           Kawamoto et al., 1988; Ng et al., 1989
Muscle Creatine Kinase            Jaynes et al., 1988; Horlick and Benfield, 1989;
                                  Johnson et al., 1989a
Prealbumin (Transthyretin)        Costa et al., 1988
Elastase I                        Omitz et al., 1987
Metallothionein                   Karin et al., 1987; Culotta and Hamer, 1989
Collagenase                       Pinkert et al., 1987; Angel et al., 1987
Albumin Gene                      Pinkert et al., 1987; Tronche et al., 1989, 1990






TABLE IV (Continued) 



Promoters                         References

α-Fetoprotein                     Godbout et al., 1988; Campere and Tilghman,
                                  1989
γ-Globin                          Bodine and Ley, 1987; Perez-Stable and
                                  Constantini, 1990
β-Globin                          Trudel and Constantini, 1987
c-fos                             Cohen et al., 1987
c-HA-ras                          Triesman, 1986; Deschamps et al., 1985
Insulin                           Edlund et al., 1985
Neural Cell Adhesion Molecule     Hirsch et al., 1990
(NCAM)
α1-Antitrypsin                    Latimer et al., 1990
H2B (TH2B) Histone                Hwang et al., 1990
Mouse or Type I Collagen          Ripe et al., 1989
Glucose-Regulated Proteins        Chang et al., 1989
(GRP94 and GRP78)
Rat Growth Hormone                Larsen et al., 1986
Human Serum Amyloid A (SAA)       Edbrooke et al., 1989
Troponin I (TN I)                 Yutzey et al., 1989
Platelet-Derived Growth Factor    Pech et al., 1989
Duchenne Muscular Dystrophy       Klamut et al., 1990
SV40                              Banerji et al., 1981; Moreau et al., 1981; Sleigh
                                  and Lockett, 1985; Firak and Subramanian,
                                  1986; Herr and Clarke, 1986; Imbra and Karin,
                                  1986; Kadesch and Berg, 1986; Wang and
                                  Calame, 1986; Ondek et al., 1987; Kuhl et al.,
                                  1987; Schaffner et al., 1988
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TABLE IV (Continued) 



Promoters                         References

Polyoma                           Swartzendruber and Lehman, 1975; Vasseur et
                                  al., 1980; Katinka et al., 1980, 1981; Tyndell et
                                  al., 1981; Dandolo et al., 1983; deVilliers et
                                  al., 1984; Hen et al., 1986; Satake et al., 1988;
                                  Campbell and Villarreal, 1988
Retroviruses                      Kriegler and Botchan, 1982, 1983; Levinson et
                                  al., 1982; Kriegler et al., 1983, 1984a,b, 1988;
                                  Bosze et al., 1986; Miksicek et al., 1986;
                                  Celander and Haseltine, 1987; Thiesen et al.,
                                  1988; Celander et al., 1988; Choi et al., 1988;
                                  Reisman and Rotter, 1989
Papilloma Virus                   Campo et al., 1983; Lusky et al., 1983;
                                  Spandidos and Wilkie, 1983; Spalholz et al.,
                                  1985; Lusky and Botchan, 1986; Cripe et al.,
                                  1987; Gloss et al., 1987; Hirochika et al., 1987;
                                  Stephens and Hentschel, 1987; Glue et al.,
                                  1988
Hepatitis B Virus                 Bulla and Siddiqui, 1986; Jameel and Siddiqui,
                                  1986; Shaul and Ben-Levy, 1987; Spandau and
                                  Lee, 1988; Vannice and Levinson, 1988
Human Immunodeficiency Virus      Muesing et al., 1987; Hauber and Cullan, 1988;
                                  Jakobovits et al., 1988; Feng and Holland,
                                  1988; Takebe et al., 1988; Rowen et al., 1988;
                                  Berkhout et al., 1989; Laspia et al., 1989;
                                  Sharp and Marciniak, 1989; Braddock et al.,
                                  1989
Cytomegalovirus                   Weber et al., 1984; Boshart et al., 1985;
                                  Foecking and Hofstetter, 1986
Gibbon Ape Leukemia Virus         Holbrook et al., 1987; Quinn et al., 1989